In chip multiprocessors (CMPs), limiting the number of off-chip cache misses is crucial for good performance. Many multithreaded programs provide opportunities for constructive cache sharing, in which concurrently scheduled threads share a largely overlapping working set. In this brief announcement, we highlight our ongoing study [4] comparing the performance of two schedulers designed for fine-grained multithreaded programs: Parallel Depth First (PDF) [2], which is designed for constructive sharing, and Work Stealing (WS) [3], which takes a more traditional approach.

Overview of schedulers. In PDF, processing cores are allocated ready-to-execute program tasks such that higher scheduling priority is given to those tasks the sequential progra...
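The priority rule described for PDF can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the schedulers evaluated in the study: it assumes unit-time tasks, assumes that each task id equals the task's position in the sequential (depth-first) execution order, and assumes that ready tasks earlier in that order get higher priority, as in the usual description of PDF [2]. The function name `pdf_schedule` and the example DAG are hypothetical.

```python
import heapq


def pdf_schedule(deps, num_cores):
    """Simulate a PDF-style greedy schedule with unit-time tasks.

    deps maps each task id to the list of task ids it depends on.
    Assumption: a task's id is its position in the sequential order,
    so a smaller id means higher scheduling priority.
    Returns a list of steps; each step lists the tasks run in parallel.
    """
    # Count unmet dependencies and build the reverse (child) edges.
    remaining = {task: len(parents) for task, parents in deps.items()}
    children = {task: [] for task in deps}
    for task, parents in deps.items():
        for parent in parents:
            children[parent].append(task)

    # Min-heap of ready tasks, ordered by sequential position.
    ready = [task for task, count in remaining.items() if count == 0]
    heapq.heapify(ready)

    steps = []
    while ready:
        # Give the cores the ready tasks that come earliest sequentially.
        batch = [heapq.heappop(ready) for _ in range(min(num_cores, len(ready)))]
        steps.append(batch)
        # Finishing a task may make its children ready.
        for task in batch:
            for child in children[task]:
                remaining[child] -= 1
                if remaining[child] == 0:
                    heapq.heappush(ready, child)
    return steps


# Hypothetical fork-join DAG: 0 forks into 1 and 4, each forks two leaves,
# and 7 joins all leaves. Ids follow a depth-first sequential order.
example_deps = {
    0: [], 1: [0], 2: [1], 3: [1],
    4: [0], 5: [4], 6: [4], 7: [2, 3, 5, 6],
}
two_core_steps = pdf_schedule(example_deps, 2)
one_core_steps = pdf_schedule(example_deps, 1)
```

With one core the sketch reproduces the sequential order. With two cores it co-schedules tasks that are adjacent in that order (for example, the two leaves under task 1), which is the intuition behind constructive sharing: tasks that run together tend to be close in the sequential execution and so are more likely to touch overlapping data.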
Current architectural trends of rising on-chip core counts and worsening power-performance penalties...
On the road to computer systems able to support the requirements of exascale applications, Chip Mult...
This paper presents a detailed study of fairness in cache sharing between threads in a chip multipro...
In chip multiprocessors (CMPs), limiting the number of offchip cache misses is crucial for good perf...
Computational task DAGs are executed on parallel computers by a task scheduling algorithm. Intellige...
The evolution of microprocessor design in the last few decades has changed significantly, moving fro...
We present a new operating system scheduling algorithm for multicore processors. Our algorithm reduc...
One of the critical problems associated with emerging chip multiprocessors (CMPs) is the management ...
CMPs allow threads to share portions of the on-chip cache. Critical to successful sharing are the p...
Exploitation of parallelism has for decades been central to the pursuit of computing performance. Th...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
The microprocessor industry has converged on the chip multiprocessor (CMP) as the architecture of choice to ...
Most parallel programs exhibit more parallelism than is available in processors produced today. Whi...
Chip Multiprocessors (CMPs) are here to stay for the foreseeable future. In terms of programmability...
The limitation imposed by instruction-level parallelism (ILP) has motivated the use of thread-level ...