The increasing demand in HPC to utilize accelerators has motivated the development of pragma-based directives to target these devices. OmpSs-2 and OpenACC are both directive-based solutions that allow application programmers to utilize accelerators. The two leverage distinct types of parallelism: task parallelism and data parallelism, respectively. Non-trivial scientific applications can benefit from both types of available parallelism. However, the combination of pragma-based models is difficult to coordinate, as both assume full control and are unaware of each other at runtime. We propose an interoperation mechanism to enable novel composability across pragma-based programming models. We study and propose a clear separation of duties and ...
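As a rough illustration of how the two kinds of parallelism could be nested, the following C sketch creates one task per block (task parallelism) and runs a data-parallel loop inside each task. It is not the mechanism proposed here: the function names `saxpy_block` and `saxpy_tasks`, the blocking scheme, and the choice of dependency clauses are assumptions made for the example. The `#pragma oss` lines follow OmpSs-2 task syntax as we understand it, and the `#pragma acc` line is a standard OpenACC construct. A compiler without support for either model ignores the pragmas and runs the code serially.

```c
#include <assert.h>
#include <stddef.h>

/* Inner level (data parallelism): update one block of y.
   With an OpenACC compiler this loop may be offloaded to an accelerator;
   otherwise the pragma is ignored and the loop runs serially. */
static void saxpy_block(size_t len, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:len]) copy(y[0:len])
    for (size_t i = 0; i < len; ++i)
        y[i] = a * x[i] + y[i];
}

/* Outer level (task parallelism): one task per block of the vectors.
   The in/inout clauses are hypothetical dependency annotations in
   OmpSs-2 style, using the [start;length] array-section form. */
void saxpy_tasks(size_t n, size_t block, float a, const float *x, float *y)
{
    assert(block > 0); /* a zero block size would never advance the loop */

    for (size_t start = 0; start < n; start += block) {
        /* The last block may be shorter than the others. */
        size_t len = (n - start < block) ? n - start : block;

        #pragma oss task in(x[start;len]) inout(y[start;len])
        saxpy_block(len, a, x + start, y + start);
    }

    /* Wait for all block tasks before returning to the caller. */
    #pragma oss taskwait
}
```

Calling `saxpy_tasks(n, block, a, x, y)` computes `y = a * x + y` over the whole vector. The sketch only shows the nesting; it does not address which runtime owns the device or its memory, which is the coordination problem described above.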
Accelerators have been deployed on most major HPC systems. They are considered to improve the perfor...
State-of-the-art programming approaches generally have a strict division between intra-node shared m...
Clusters of SMPs are ubiquitous. They have traditionally been programmed using MPI. However, the prod...
Current trends in High Performance Computing suggest a significant shift towards heterogeneous archi...
CUDA and OpenCL are the most widely used programming models to exploit hardware accelerators. Both p...
Task-based parallel programming models based on compiler directives have proved their effectiveness ...
The latest OpenMP specification, 4.0, includes the accelerator model. In this paper we present a pa...
The use of GPU accelerators is becoming common in HPC platforms due to their effective performan...
Programming models for task-based parallelization based on compile-time directives are very effectiv...
Twenty-first century parallel programming models are becoming really complex due to the dive...
HPC machines in the race for exascale computing are more heterogeneous than ever. The complexity of ...
The advent of heterogeneous computing has forced programmers to use platform specific programming pa...
The need for features for managing complex data accesses in modern programming models has increased ...
This work was supported by the MEEP project, which has received funding from the European High-Performan...
OpenMP has been for many years the most widely used programming model for shared memory architecture...