Synchronization operations like barriers are frequently seen in parallel OpenMP programs, where an inefficient implementation can severely limit application performance. While synchronization optimization has been heavily studied on traditional x86 architectures, there is no consensus on how synchronization is best implemented on ARMv8 multi-core CPUs. This paper presents a study of OpenMP synchronization implementation on two representative ARMv8 multi-core architectures, Phytium 2000+ and ThunderX2, by considering various OpenMP synchronization mechanisms offered by two mainstream OpenMP compilers, GCC and LLVM. Our evaluation compares the performance, overhead and scalability of both compiler implementations. We show that...
Barrier synchronization is a key programming primitive for shared memory embedded MPSoCs. A...
The OpenMP Application Programming Interface (API) is an emerging standard for parallel programming ...
Tasking is the most significant feature included in the new OpenMP 3.0 standard. It was introduced t...
OpenMP implementations must exploit current and upcoming hardware for performance. Overhead must be ...
Synchronization mechanisms have been central issues in the race toward the com...
Synchronization mechanisms have been a critical issue in the race toward the c...
To simplify program development for the Singlechip Cloud Computer (SCC) it is desirable to have high...
In [8], we demonstrated that contrary to sequential applications, parallel Ope...
Exascale systems will exhibit much higher degrees of parallelism both in terms of the number of node...
We have developed compiler optimization techniques for explicit parallel programs using the OpenMP A...
OpenMP is an application programmer interface that provides a parallel programming model that has ...
Multi-core architectures have become more popular due to better performance, reduced heat dissipatio...
In order to improve its expressivity with respect to unstructured parallelism, OpenMP 3.0 introduced...
High parallelism of MPSoC applications increases the need for optimization of the synchronization mec...
The Cray XMT architecture has incited curiosity among computer architects and system software design...