Data dependences in sequential programs limit parallelization because extracted threads cannot run independently. Although thread-level speculation can avoid the need for precise dependence analysis, the communication overhead required to synchronize actual dependences counteracts the benefits of parallelization. To address these challenges, we propose a lightweight architectural enhancement co-designed with a parallelizing compiler, which together decouple communication from thread execution. Simulations of these approaches, applied to a processor with 16 Intel Atom-like cores, show an average speedup of 6.85× for six SPEC CINT2000 benchmarks.
This work was possible thanks to the sponsorship of the Royal Academy of Engineering, ...
Graduation date: 2009. General-purpose computer systems have seen increased performance potential thro...
Irregular applications have frequent data-dependent memory accesses and control flow. They arise in ...
Parallel software is now required to exploit the abundance of threads and processors in modern multi...
We describe and evaluate HELIX, a new technique for automatic loop parallelization that assigns succ...
Improving system performance increasingly depends on exploiting microprocessor parallelism, yet main...
As classic Dennard process scaling fades into the past, power density concerns have driven modern CP...
Parallelism has become the primary way to maximize processor performance and power efficiency. But b...
Thesis (Ph.D.), University of Rochester, Dept. of Computer Science, 2012. Speculative parallelizatio...
In this paper, we have presented the design and evaluation of a compiler system, called APE, for ...
Loops in scientific and engineering applications provide a rich source of parallelism. In order to o...
As we look to the future, and the prospect of a billion transistors on a chip, it seems inevitable t...
Parallel hardware has become a ubiquitous component in computer processing technology. Uniprocessor...
Grantor: University of Toronto. To fully exploit the potential of single-chip multiprocessor...