Compiler-based auto-parallelization is a much-studied area but has yet to find widespread application. This is largely due to the poor identification and exploitation of application parallelism, resulting in disappointing performance far below that which a skilled expert programmer could achieve. We have identified two weaknesses in traditional parallelizing compilers and propose a novel, integrated approach resulting in significant performance improvements of the generated parallel code. Using profile-driven parallelism detection, we overcome the limitations of static analysis, enabling the identification of more application parallelism, and only rely on the user for final approval. We then replace the traditional target-specific and infle...
With the rise of chip multiprocessors (CMPs), the amount of parallel computing power will increase s...
The end of Dennard scaling also brought an end to frequency scaling as a means to improve performanc...
The performance of many parallel applications relies not on instruction-level parallelism but on loo...
Compiler-based auto-parallelization is a much-studied area, yet has still not found widespread appl...
The efficient mapping of program parallelism to multi-core processors is highly dependent on the und...
Multi-core processors are now ubiquitous and are widely seen as the most viable means of delivering ...
Traditional parallelism detection in compilers is performed by means of static analysis and more spe...
Traditional static analysis fails to auto-parallelize programs with a complex control and data flow....
Parallelization is a technique that boosts the performance of a program beyond optimizations of the ...
In today’s multicore era, parallelization of serial code is essential in order to exploit the archit...
Abstract—We investigate an automatic method for classifying which regions of sequential programs cou...
Single-core designs and architectures have reached their limits due to heat and power walls. In orde...
Abstract—Performance growth of single-core processors has come to a halt in the past decade, but was...
Abstract—Recently parallel architectures have entered every area of computing, from multi-core proce...