Abstract—We present and analyze two new communication libraries, cudaMPI and glMPI, that provide an MPI-like message passing interface for communicating data stored on the graphics cards of a distributed-memory parallel computer. These libraries can help applications that perform general-purpose computations on networked GPU clusters. We explore how to efficiently support both point-to-point and collective communication for either contiguous or noncontiguous data on modern graphics cards. Our software design is informed by a detailed analysis of the actual performance of modern graphics hardware, for which we develop and test a simple but useful performance model.

I. INTRODUCTION AND PRIOR WORK

In 1968 Myer and Sutherland [1] asked a sur...
GPUs are widely used in high performance computing, due to their high computational power and high p...
Abstract—Current implementations of MPI are unaware of accelerator memory (i.e., GPU device memory) ...
Abstract—Data movement in high-performance computing systems accelerated by graphics processing unit...
This paper explores the challenges in implementing a message passing interface usable on systems wit...
Today, GPUs and other parallel accelerators are widely used in high performance computing, due to th...
After the introduction of CUDA by Nvidia, the GPUs became devices capable of accelerating any genera...
Current trends in computing and system architecture point towards a need for accelerators such as GP...
Communication hardware and software have a significant impact on the performance of clusters and sup...
Modern GPUs are powerful high-core-count processors, which are no longer used solely for graphics ap...
Parallel computing on clusters of workstations and personal computers has very high potential, sinc...
Heterogeneous supercomputers are now considered the most valuable solution to ...
Parallel computing on clusters of workstations and personal computers has very high potential, since...
Graphic Processing Units (GPUs) are widely used in high performance computing, due to their high com...
Modern HPC platforms are using multiple CPU, GPUs and high-performance interconnects per node. Unfor...
The introduction and rise of General Purpose Graphics Computing has significantly impacted parallel ...