Traditional static analysis fails to auto-parallelize programs with complex control and data flow. Furthermore, thread-level parallelism in such programs is often restricted to pipeline parallelism, which can be hard for a programmer to discover. In this paper we propose a tool that, based on profiling information, helps the programmer discover parallelism. The programmer hand-picks code transformations from among the proposed candidates, which are then applied by automatic code transformation techniques. This paper contributes to the literature by presenting a profiling tool for discovering thread-level parallelism. We track dependencies at the whole-data-structure level rather than at the element or byte level in order to li...
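The whole-data-structure tracking mentioned above can be illustrated with a minimal sketch (hypothetical code, not from the paper): each memory access in a profiled trace is recorded against an object identifier rather than an individual element address, so one cross-iteration dependence is reported per data structure no matter how many elements were touched.

```python
# Minimal sketch of object-granularity dependence profiling.
# Assumed input format (illustrative only): a list of
# (iteration, op, obj_id) tuples with op in {'R', 'W'}.

def profile_deps(trace):
    """Return cross-iteration dependences at whole-object granularity."""
    last_write = {}  # obj_id -> iteration of the most recent write
    deps = set()
    for it, op, obj in trace:
        if obj in last_write and last_write[obj] != it:
            # Cross-iteration dependence (flow or output), recorded
            # once per object rather than once per element access.
            deps.add((last_write[obj], it, obj))
        if op == 'W':
            last_write[obj] = it
    return deps

# Iteration 1 reads array 'A' written in iteration 0: a single
# object-level dependence, regardless of element count.
trace = [(0, 'W', 'A'), (0, 'W', 'A'), (1, 'R', 'A'), (1, 'W', 'B')]
print(profile_deps(trace))  # {(0, 1, 'A')}
```

Collapsing accesses onto object identifiers is what keeps the profiling overhead low at the cost of reporting conservative (coarser) dependences.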
Maximizing performance on modern multicore hardware demands aggressive optimizations. Large amounts o...
With the rise of chip-multiprocessors, the problem of parallelizing general-purpose programs has onc...
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops becau...
Traditional parallelism detection in compilers is performed by means of static analysis and more spe...
With the rise of Chip multiprocessors (CMPs), the amount of parallel computing power will increase s...
This paper describes a tool using one or more executions of a sequential progr...
With the rise of Chip multiprocessors (CMPs), the amount of parallel computing power will increase s...
In the era of multicore processors, the responsibility for performance gains has been shifted onto s...
While the chip multiprocessor (CMP) has quickly become the predominant processor architecture, its c...
Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 2012. Speculative parallelizatio...
Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 1991. Simultaneously published i...
With the evolution of multi-core, multi-threaded processors from simple-scalar processors, the perfo...
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops becau...
Modern computers will increasingly rely on parallelism to achieve high computation rates. Techniques...
Parallelization is a technique that boosts the performance of a program beyond optimizations of the ...