As classic Dennard process scaling fades into the past, power density concerns have driven modern CPU designs to de-emphasize the pursuit of single-thread performance, focusing instead on increasing the number of cores on a chip. Computing throughput on a modern chip continues to improve, since multiple programs can run in parallel, but the performance of an individual program improves only incrementally. Many compilers have been designed to automatically parallelize sequentially written programs by leveraging multiple cores for the same task, thereby enabling continued single-program performance gains. One such compiler is HELIX, which can speed up a mixture of SPECfp and SPECint benchmarks by 2X on a 6-core Nehalem CPU. Pr...
University of Minnesota Ph.D. dissertation. June 2009. Major: Computer Science. Advisors: Prof. Pen-...
Maximal utilization of cores in multicore architectures is key to realizing the potential performance ...
Irregular applications have frequent data-dependent memory accesses and control flow. They arise in ...
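As a concrete illustration of the behavior described in the abstract above, the following minimal C sketch shows a loop whose memory accesses and branches both depend on runtime data; the names (a, idx, threshold) are invented for the example and do not come from the cited work.

```c
/*
 * Illustrative only: an "irregular" loop. The access a[idx[i]] is a
 * data-dependent memory access, and the branch on the loaded value is
 * data-dependent control flow, so neither can be resolved at compile time.
 */
#include <stddef.h>

long irregular_sum(const long *a, const size_t *idx, size_t n, long threshold) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        long v = a[idx[i]];    /* data-dependent memory access */
        if (v > threshold)     /* data-dependent control flow  */
            sum += v;
    }
    return sum;
}
```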
We describe and evaluate HELIX, a new technique for automatic loop parallelization that assigns succ...
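To make the idea of assigning successive loop iterations to different cores concrete, here is a minimal hand-written sketch, not actual HELIX compiler output: iterations are distributed round-robin across worker threads, and a loop-carried dependence is honored by making the ordered ("sequential") part of iteration i wait for iteration i-1. NUM_CORES, parallel_work(), and sequential_segment() are illustrative placeholders.

```c
/* Sketch of round-robin iteration assignment with an ordered segment. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NUM_CORES 4
#define N_ITERS   16

static atomic_int turn = 0;   /* iteration whose ordered segment may run next */
static long       sum  = 0;   /* carries the loop-carried dependence */

static long parallel_work(int i)       { return (long)i * i; }  /* independent part */
static void sequential_segment(long v) { sum += v; }            /* ordered part */

static void *worker(void *arg) {
    int tid = (int)(long)arg;
    for (int i = tid; i < N_ITERS; i += NUM_CORES) {  /* round-robin iterations */
        long v = parallel_work(i);                    /* runs concurrently */
        while (atomic_load(&turn) != i)               /* wait for iteration i-1 */
            ;                                         /* spin (kept simple for the sketch) */
        sequential_segment(v);                        /* executed in iteration order */
        atomic_store(&turn, i + 1);                   /* release the next iteration */
    }
    return NULL;
}

int main(void) {
    pthread_t t[NUM_CORES];
    for (long c = 0; c < NUM_CORES; c++)
        pthread_create(&t[c], NULL, worker, (void *)c);
    for (int c = 0; c < NUM_CORES; c++)
        pthread_join(t[c], NULL);
    printf("sum = %ld\n", sum);   /* matches the purely sequential loop */
    return 0;
}
```

The sketch deliberately uses a single shared counter for ordering; a production system would overlap communication and computation far more aggressively, but the core assignment scheme is the same.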
Data dependences in sequential programs limit parallelization because extracted threads cannot run i...
Parallelism has become the primary way to maximize processor performance and power efficiency. But b...
Data dependences in sequential programs limit parallelization because extracted threads cannot run ...
Improving system performance increasingly depends on exploiting microprocessor parallelism, yet main...
Graduation date: 2009. General purpose computer systems have seen increased performance potential thro...
Faced with nearly stagnant clock speed advances, chip manufacturers have turned to parallelism as th...
The end of Dennard scaling also brought an end to frequency scaling as a means to improve performanc...
Multi-core processors have become the dominant processor architecture with 2, 4, and 8 cores on a ch...
We present Outrider, an architecture for throughput-oriented processors that exploits intra-thread m...
Multicore systems have become the dominant mainstream computing platform. One of the biggest challen...
Multi-core processors are now ubiquitous and are widely seen as the most viable means of delivering...