This paper presents a new approach to detecting coarse-grain parallelism in loop nests that contain complex computations, including subscripted subscripts and conditional statements that introduce complex control flow at run time. The approach is based on recognizing the computational kernels calculated in a loop without considering the semantics of the code. The detection is carried out on top of the Gated Single Assignment (GSA) program representation at two different levels. First, the use-def chains between the statements that compose the strongly connected components (SCCs) of the GSA use-def chain graph are analyzed (intra-SCC analysis). As a result, the kernel computed in each SCC is recognized. Second, the use...
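The first level of the analysis described above starts from the SCCs of a use-def chain graph. The following is a minimal sketch of that starting point only, not the paper's implementation: the graph is written by hand for a hypothetical loop containing a conditional sum (`s`) and an update through a subscripted subscript (`a`), the node names (`s0`..`s3`, `a0`..`a2`) are illustrative assumptions, and Tarjan's algorithm is used here simply as a standard way to find SCCs. Kernel recognition inside each SCC is not shown.

```python
from typing import Dict, List


def strongly_connected_components(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Tarjan's algorithm; returns each SCC as a sorted list of node names."""
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    result: List[List[str]] = []
    counter = 0

    def visit(v: str) -> None:
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of an SCC: pop its members off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            result.append(sorted(component))

    for v in graph:
        if v not in index:
            visit(v)
    return result


# Hypothetical use-def chain graph: an edge u -> d means "the statement
# defining u uses the value defined by d". Hand-written for illustration;
# it is NOT the output of a real GSA construction pass.
use_def = {
    "s0": [],            # value of s before the loop
    "a0": [],            # value of a before the loop
    "s1": ["s0", "s3"],  # loop-header merge of s
    "s2": ["s1"],        # s updated inside the conditional
    "s3": ["s1", "s2"],  # merge of s after the conditional
    "a1": ["a0", "a2"],  # loop-header merge of a
    "a2": ["a1"],        # a updated through a subscripted subscript
}

sccs = strongly_connected_components(use_def)
# SCCs with more than one node correspond to loop-carried cycles in this toy graph.
nontrivial = sorted(c for c in sccs if len(c) > 1)
print(nontrivial)
```

In this toy graph each multi-node SCC groups the statements that define one recurrence, which is the unit on which an intra-SCC analysis would then operate.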
Modern computers will increasingly rely on parallelism to achieve high computation rates. Techniques...
While the chip multiprocessor (CMP) has quickly become the predominant processor architecture, its c...
Maximizing performance on modern multicore hardware demands aggressive optimizations. Large amounts o...
The automatic parallelization of loops that contain complex computations is still a challenge for cu...
[Abstract] Summary form only given. The automatic parallelization of loops that contain complex comp...
Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 1991. Simultaneously published i...
Traditional static analysis fails to auto-parallelize programs with a complex control and data flow....
In recent years research in the area of parallel architectures and parallel languages has become mor...
With the rise of chip-multiprocessors, the problem of parallelizing general-purpose programs has onc...
Traditional parallelism detection in compilers is performed by means of static analysis and more spe...
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops becau...