Abstract: "Data-parallel programming languages have many desirable features, such as single-thread semantics and the ability to express fine-grained parallelism. However, it is challenging to implement such languages efficiently on conventional MIMD multiprocessors, because these machines incur a high overhead for small grain sizes. This paper presents compile-time analysis techniques for data-parallel program graphs that reduce these overheads in two ways: by stepping up the grain size, and by relaxing the synchronous nature of the computation without altering the program semantics.The algorithms partition the program graph into clusters of nodes such that all nodes in a cluster have the same loop structure, and futher refine these cluster...
A method for assessing the benefits of fine-grain parallelism in "real" programs is pres...
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops becau...
Parallel programming is hard and programmers still struggle to write code for shared memory multicor...
This paper describes a tool using one or more executions of a sequential progr...
We present compiler optimization techniques for explicitly parallel programs that communicate thro...
226 p. Thesis (Ph.D.), University of Illinois at Urbana-Champaign, 1993. Explicit parallelism not only...
Despite the performance potential of parallel systems, several factors have hindered their widesprea...
This paper describes a method of analysis for detecting and minimizing memory latency using a direct...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
For a wide variety of applications, both task and data parallelism must be exploited to achieve the ...
With the rise of chip-multiprocessors, the problem of parallelizing general-purpose programs has onc...
It is possible to reduce the computation time of data parallel programs by dividing the computation ...
The goal of this research is to retarget multimedia programs written in sequential languages (e.g., ...
The goal of parallelizing, or restructuring, compilers is to detect and exploit parallelism in seque...
Data-parallel languages, such as High Performance Fortran or Fortran D, provide a machin...