Task-parallel languages are increasingly popular. Many of them provide expressive mechanisms for intertask synchronization. For example, OpenMP 4.0 will integrate data-driven execution semantics derived from the StarSs research language. Compared to the more restrictive data-parallel and fork-join concurrency models, the advanced features being introduced into task-parallel models enable improved scalability through load balancing, memory latency hiding, mitigation of memory bandwidth pressure, and, as a side effect, reduced power consumption. In this article, we develop a systematic approach to compile loop nests into concurrent, dynamically constructed graphs of dependent tasks. We propose a simple and effective heuristic th...
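The abstract's reference to OpenMP 4.0's data-driven task semantics is easier to follow with a concrete fragment. The sketch below is illustrative only, not the paper's compiler output: it hand-codes a tiled wavefront loop nest as OpenMP 4.0 tasks whose depend clauses let the runtime build the task graph dynamically. The array name, tile count, and process function are hypothetical.

```c
#include <stdio.h>

#define N 8                      /* hypothetical number of tiles per dimension */

double block[N][N];              /* one representative element per tile */

/* Hypothetical per-tile work: combines the north and west neighbors. */
static void process(int i, int j)
{
    block[i][j] += block[i - 1][j] + block[i][j - 1];
}

int main(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            block[i][j] = 1.0;

    #pragma omp parallel
    #pragma omp single
    for (int i = 1; i < N; i++)
        for (int j = 1; j < N; j++)
            /* The in/out sets declare the data-driven dependences; the
             * runtime instantiates the graph of dependent tasks as the
             * loop nest unfolds, in the StarSs-derived style the
             * abstract describes. */
            #pragma omp task depend(in: block[i-1][j], block[i][j-1]) \
                             depend(out: block[i][j])
            process(i, j);

    printf("%f\n", block[N - 1][N - 1]);
    return 0;
}
```

Generating such tasks automatically from a loop nest, and choosing a profitable granularity for them, is precisely what the article's compilation approach targets.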
We have developed compiler optimization techniques for explicit parallel programs using the OpenMP A...
In the foreseeable future, high-performance supercomputers will continue to evolve in the direction ...
Compute-intensive applications running on clusters of shared-memory computers are typically implemen...
Modern parallel programming models perform their best under the particular patterns they are tuned t...
Parallel task-based programming models like OpenMP support the declaration of task data dependences....
Thus far, parallelism at the loop level (or data-parallelism) has been almost exclusively the main t...
The goal of parallelizing, or restructuring, compilers is to detect and exploit parallelism in seque...
This paper describes a tool using one or more executions of a sequential progr...
We present OpenStream, a data-flow extension of OpenMP to express dynamic dependent tasks. The language...
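Since OpenStream appears in this list, a small producer/consumer fragment in its style may help. The input/output clause spelling below is reconstructed from the examples in the OpenStream publications, so treat it as a hedged sketch rather than authoritative syntax.

```c
#include <stdio.h>

/* Hedged sketch of OpenStream-style dependent tasks: a plain variable
 * used in input/output clauses acts as a stream connecting producer
 * and consumer tasks. Clause spelling is reconstructed from the
 * OpenStream papers and should be checked against them. */
int x;   /* used as a stream by the tasks below */

void pipeline(int n)
{
    for (int i = 0; i < n; i++) {
        #pragma omp task firstprivate(i) output(x)   /* producer task */
        x = i * i;

        #pragma omp task input(x)                    /* consumer task */
        printf("%d\n", x);
    }
}
```

In contrast to the fixed in/out sets of OpenMP 4.0 depend clauses, streams of this kind let producers and consumers be matched dynamically, which is the expressiveness the OpenStream abstract emphasizes.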
Ph.D. thesis, University of Rochester, Dept. of Computer Science, 2012. Speculative parallelizatio...
Complex embedded systems are designed under tight constraints on response time, resource ...
Dataflow models of computation were acknowledged early on as an attractiv...
High-level abstractions for parallel programming simplify the development of efficient parallel app...
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops becau...