Reductions matter and they are here to stay. The wide adoption of parallel processing hardware across a broad range of computer applications has encouraged recent research on the efficient parallelization of reductions. Furthermore, the trend towards high-productivity languages in mainstream computing increases the demand for efficient programming support. In this paper we present a new approach to parallel reductions for distributed memory systems that provides both scalability and programmability. Using OmpSs, a task-based parallel programming model, the developer can express scalable reductions through a single pragma annotation. This pragma annotation is applicable to tasks as well as to work-sharing constructs (with implicit tasking)...
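To make the idea of a reduction expressed through a single pragma annotation concrete, the following is a minimal sketch in C. It uses the standard OpenMP `reduction` clause on a work-sharing loop as a shared-memory analogy. It is not the OmpSs syntax or the distributed-memory mechanism the paper describes, and the function name `sum_of_squares` is a hypothetical chosen for illustration.

```c
#include <stddef.h>

/*
 * Illustrative only: sums the squares of a[0..n-1].
 * The single pragma below is the standard OpenMP reduction clause,
 * used here as an analogy; the paper's OmpSs annotation may differ.
 * Each thread accumulates into a private copy of `sum`, and the
 * private copies are combined with `+` when the loop finishes.
 * Compiled without OpenMP support, the pragma is ignored and the
 * loop runs sequentially with the same result for this input class.
 */
double sum_of_squares(const double *a, size_t n)
{
    double sum = 0.0;

    #pragma omp parallel for reduction(+: sum)
    for (long i = 0; i < (long)n; ++i) {
        sum += a[i] * a[i];
    }

    return sum;
}
```

Calling `sum_of_squares` on an array holding 1, 2, 3 and 4 returns 30. The point of the sketch is that the loop body stays sequential code and the annotation alone states that `sum` is a reduction target, so the runtime is free to choose how partial results are combined.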
Reductions are a well-known computational pattern found in scientific applications that needs effici...
Distributed Memory Multicomputers (DMMs) such as the IBM SP-2, the Intel Paragon and the Thinking Ma...
State-of-the-art programming approaches generally have a strict division between intra-node shared m...
Clusters of SMPs are ubiquitous. They have been traditionally programmed by using MPI. But, the prod...
Wide adoption of parallel processing hardware in mainstream computing as well as the interest for ef...
The need for features for managing complex data accesses in modern programming models has increased ...
© Springer International Publishing Switzerland 2014. The wide adoption of parallel processing hardw...
It has become common knowledge that parallel programming is needed for scientific applications, part...
The wide adoption of parallel processing hardware in mainstream computing as well as the rising int...
Twenty-first century parallel programming models are becoming really complex due to the dive...
Clusters of GPUs are emerging as a new computational scenario. Programming them requires the use of ...
Reduction recognition and optimization are crucial techniques in parallelizing compilers. They are u...
Applications with high complexity, heavy computations and processing of large amounts of data require...
As new heterogeneous systems and hardware accelerators appear, high performance computers can reach ...
This was a two-page overview of my NSF-funded project Supercomputing on a Cluster of Workstations v...