Overlapping communications with computation is an efficient way to amortize the cost of communications of an HPC application. To do so, it is possible to use MPI nonblocking primitives so that communications run in the background alongside computation. However, these mechanisms rely on communications actually making progress in the background, which may not be true for all MPI libraries. Some MPI libraries dedicate a core to communications to ensure communication progression. However, taking a core away from the application for this purpose may have a negative impact on the overall execution time, and it may be difficult to know when such a dedicated core is actually helpful. In this paper, we propose a model for t...
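To make the overlap pattern described above concrete, the following is a minimal sketch (not taken from the paper) of the usual approach: post MPI_Irecv/MPI_Isend, run independent computation, then complete with MPI_Waitall. The message size, ring-style peer choice, and compute kernel are placeholders; whether the transfers actually progress during the compute loop depends on the MPI library, which is precisely the issue raised above.

/* Sketch: overlap nonblocking point-to-point communication with computation.
 * Illustrative only; N, the ring pattern, and the kernel are arbitrary choices. */
#include <mpi.h>
#include <stdlib.h>

#define N 1000000

static void compute(double *a, int n) {
    for (int i = 0; i < n; i++)            /* independent work to overlap with */
        a[i] = a[i] * 0.5 + 1.0;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *sendbuf = calloc(N, sizeof(double));
    double *recvbuf = calloc(N, sizeof(double));
    double *work    = calloc(N, sizeof(double));
    int next = (rank + 1) % size;          /* send to the next rank in the ring */
    int prev = (rank - 1 + size) % size;   /* receive from the previous rank */

    MPI_Request reqs[2];
    /* Post communication first so it can, in principle, progress in the background. */
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, prev, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, next, 0, MPI_COMM_WORLD, &reqs[1]);

    compute(work, N);                      /* computation not touching the buffers */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   /* completion point */

    free(sendbuf); free(recvbuf); free(work);
    MPI_Finalize();
    return 0;
}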
We examine the mechanics of the send and receive mechanism of MPI and in particular how we can imple...
The advancement of multicore systems d...
This paper presents a portable optimization for MPI communications, called PRAcTICaL-MPI (Portable A...
In this report we describe the conversion of a simple Master-Worker parallel program from global blo...
In High Performance Computing (HPC), minimizing communication overhead is one of the most important ...
In HPC applications, one of the major overheads compared to sequential code is...
By allowing computation/communication overlap, MPI nonblocking collectives (NB...
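As an illustration of the kind of overlap the nonblocking-collective abstracts discuss, here is a minimal sketch using an MPI-3 nonblocking collective (MPI_Iallreduce): the reduction is started, independent work runs, and the result is read only after MPI_Wait. The local value and the placement of the compute phase are assumptions made for the example.

/* Sketch: overlap a nonblocking collective with independent computation. */
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = rank + 1.0, global = 0.0;
    MPI_Request req;

    /* Start the collective; it may progress in the background depending on the library. */
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &req);

    /* ... computation that does not depend on 'global' would go here ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);   /* 'global' is valid only after this point */

    MPI_Finalize();
    return 0;
}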
With processor speeds no longer doubling every 18-24 months owing to the exponential increase in pow...
With the growing number of cores and fast networks such as InfiniBand, one of the ...
MPI is widely used for programming large HPC clusters. MPI also includes persistent operations, whic...
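To illustrate the persistent operations mentioned above, here is a minimal sketch assuming a simple ring exchange repeated over several iterations: the requests are created once with MPI_Recv_init/MPI_Send_init, restarted each iteration with MPI_Startall, and freed at the end. The message size and iteration count are arbitrary.

/* Sketch: persistent point-to-point operations reused across iterations. */
#include <mpi.h>

#define N 1024
#define ITERS 100

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double sendbuf[N] = {0}, recvbuf[N];
    int next = (rank + 1) % size;
    int prev = (rank - 1 + size) % size;
    MPI_Request reqs[2];

    /* One-time setup: the same requests are reused every iteration,
     * letting the library amortize matching and setup costs. */
    MPI_Recv_init(recvbuf, N, MPI_DOUBLE, prev, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Send_init(sendbuf, N, MPI_DOUBLE, next, 0, MPI_COMM_WORLD, &reqs[1]);

    for (int it = 0; it < ITERS; it++) {
        MPI_Startall(2, reqs);                       /* restart both operations */
        /* ... computation could overlap here ... */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }

    MPI_Request_free(&reqs[0]);
    MPI_Request_free(&reqs[1]);
    MPI_Finalize();
    return 0;
}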
In the exascale computing era, applications are executed at a larger scale than ever before, which results ...
This article demonstrates the performance benefits of the MPI-3 nonblocking collective operations su...
Communication hardware and software have a significant impact on the performance of clusters and sup...