HPC systems have experienced significant growth over the past years, with modern machines having hundreds of thousands of nodes. Message Passing Interface (MPI) is the de facto standard for distributed computing on these architectures. On the MPI critical path, the message-matching process is one of the most time-consuming operations. In this process, searching for a specific request in a message queue represents a significant part of the communication latency. So far, no miracle algorithm performs well in all cases. This paper explores potential matching specializations thanks to hints introduced in the latest MPI 4.0 standard. We propose a hash-table-based algorithm that performs constant time message-matching for no...
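The idea of hint-driven constant-time matching can be sketched as follows. This is an illustrative toy, not the paper's implementation: it assumes hints such as MPI 4.0's mpi_assert_no_any_source and mpi_assert_no_any_tag rule out wildcard receives, so every posted receive hashes to exactly one bucket keyed on (source, tag), replacing the linear scan of a single posted-receive queue with an O(1) expected lookup. All names here (post_recv, match_msg, bucket_of) are invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>

#define NBUCKETS 64

/* One posted receive, stored in a per-bucket singly linked list. */
typedef struct Recv {
    int source, tag;
    struct Recv *next;
} Recv;

static Recv *buckets[NBUCKETS];

/* Hash a (source, tag) pair to a bucket index.
 * Valid only when wildcards are excluded by hints. */
static unsigned bucket_of(int source, int tag) {
    unsigned h = (unsigned)source * 2654435761u ^ (unsigned)tag;
    return h % NBUCKETS;
}

/* Post a receive: O(1) insert at the head of its bucket. */
void post_recv(int source, int tag) {
    Recv *r = malloc(sizeof *r);
    r->source = source;
    r->tag = tag;
    unsigned b = bucket_of(source, tag);
    r->next = buckets[b];
    buckets[b] = r;
}

/* Match an incoming message against the posted receives:
 * expected O(1) instead of scanning one global queue.
 * Returns 1 and unlinks the matched receive, 0 if none matches. */
int match_msg(int source, int tag) {
    unsigned b = bucket_of(source, tag);
    for (Recv **p = &buckets[b]; *p; p = &(*p)->next) {
        if ((*p)->source == source && (*p)->tag == tag) {
            Recv *hit = *p;
            *p = hit->next;
            free(hit);
            return 1;
        }
    }
    return 0;
}
```

In a real MPI library the application would supply the assertions via MPI_Info keys on the communicator (e.g. with MPI_Info_set and MPI_Comm_set_info), letting the library switch to such a specialized matching path only when the hints make it safe.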
MPI is the de facto standard for portable parallel programming on high-end sy...
MPI libraries are widely used in applications of high performance computing. Yet, effective tuning o...
Overlapping communications with computation is an efficient way to amortize th...
MPI is widely used for programming large HPC clusters. MPI also includes persistent operations, whic...
The availability of cheap computers with outstanding single-processor performance coupled with Ether...
Communication hardware and software have a significant impact on the performance of clusters and sup...
New kinds of applications with lots of threads or irregular communication patt...
Over the last few decades, Message Passing Interface (MPI) has become the parallel-communication sta...
This paper presents a portable optimization for MPI communications, called PRAcTICaL-MPI (Portable A...
Modern high performance computing (HPC) applications, for example adaptive mesh refinement and mul...
Message matching within MPI is an important performance consideration for applications that utilize ...
Supercomputing applications rely on strong scaling to achieve faster results on a larger number of p...
The Message-Passing Interface (MPI) is a widely-used standard library for programming parallel appli...