The Global Address Space Programming Interface (GPI) is the PGAS API developed at the Fraunhofer ITWM. It is a thin layer that delivers the full performance of remote direct memory access (RDMA)-enabled networks to the application without interrupting the CPU. We briefly introduce GPI and compare application performance between GPI and MPI implementations of a large CFD code (TAU), a quantum physics code (BQCD), and a load-balancing benchmark (UTS). We show which steps are necessary to (re-)implement existing MPI codes in GPI. We see that the GPI implementations not only perform better and are more robust but, more importantly, also scale better. We argue that heteroge...
Modern HPC platforms are using multiple CPU, GPUs and high-performance interconnects per node. Unfor...
Recent developments in computer architectures are progressing towards systems with large core counts (Ma...
At the threshold to exascale computing, limitations of the MPI programming model become more and mor...
One of the main hurdles of partitioned global address space (PGAS) approaches is the dominance of me...
Whereas most applications in the realm of the partitioned global address space make use of PGAS lan...
Partitioned Global Address Space (PGAS) languages and one-sided communication enable application dev...
The Partitioned Global Address Space (PGAS) model is a parallel programming model that aims to impr...
The Message Passing Interface (MPI) is the library-based programming model employed by most scalable...
We compare the performance of BQCD, a typical high-performance computing application, using either the M...
The Message Passing Interface (MPI) is the library-based programming model employed by most scalable...
There is an emerging need for adaptive, lightweight communication in irregular HPC applications at e...
Partitioned Global Address Space (PGAS) programming models provide a convenient approach t...
This work presents a dynamic development flow to integrate FPGA accelerators into software applicati...
We are presenting THeGASNet, a framework to provide remote memory communication and synchronization ...
Data movement in high-performance computing systems accelerated by graphics processing unit...