This work presents and evaluates algorithms for MPI collective communication operations on high-performance systems. Existing collective communication algorithms are investigated extensively, and a universal algorithm is introduced to improve the performance of MPI collective operations on hierarchical clusters. The algorithm uses shared-memory buffers for efficient intra-node communication while still allowing unmodified, hierarchy-unaware traditional collectives to handle inter-node communication. Across a variety of collectives, the universal algorithm outperforms both the MPICH algorithms and the Cray MPT algorithms. Speedups average 15x-30x for most collectives, with improved scalability up to ...
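The two-level structure described above can be illustrated with a small sketch. This is a pure-Python simulation rather than MPI code, and it is not the authors' implementation: the function names `flat_allreduce` and `hierarchical_allreduce`, the `ranks_per_node` parameter, and the choice of allreduce as the example collective are all assumptions made for illustration. A local reduction stands in for the shared-memory intra-node step, and an unchanged flat routine stands in for the hierarchy-unaware inter-node collective.

```python
from functools import reduce
import operator


def flat_allreduce(values, op=operator.add):
    """Hierarchy-unaware allreduce: every participant receives the full reduction.

    Hypothetical stand-in for an unmodified traditional collective.
    """
    if not values:
        return []
    total = reduce(op, values)
    return [total for _ in values]


def hierarchical_allreduce(values, ranks_per_node, op=operator.add):
    """Simulate a two-level allreduce; values[i] is the contribution of rank i.

    Assumption: ranks are placed on nodes in contiguous blocks of
    ranks_per_node, and the last node may hold fewer ranks.
    """
    if ranks_per_node <= 0:
        raise ValueError("ranks_per_node must be positive")
    if not values:
        return []

    # Group ranks by node.
    nodes = [values[i:i + ranks_per_node]
             for i in range(0, len(values), ranks_per_node)]

    # Phase 1: intra-node reduction. On a real system this is where a
    # shared-memory buffer would be used; here it is a plain local reduce.
    node_partials = [reduce(op, node) for node in nodes]

    # Phase 2: inter-node step among one leader per node, using the
    # unmodified flat collective on the per-node partial results.
    leader_results = flat_allreduce(node_partials, op)

    # Phase 3: intra-node broadcast from each leader to its local ranks.
    result = []
    for node, leader_value in zip(nodes, leader_results):
        result.extend(leader_value for _ in node)
    return result
```

In this sketch only one value per node enters the inter-node phase, so the flat collective runs over the number of nodes rather than the number of ranks. The result matches `flat_allreduce` on the same input for associative and commutative operations such as sum or max.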
To amortize the cost of MPI collective operations, non-blocking collectives ha...
The performance of collective communication operations is one of the deciding factors in the overa...
This paper describes a novel methodology for implementing a common set of collective communication o...
In the exascale computing era, applications are executed at larger scale than ever before, which results ...
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014...
Collective communication is an important subset of the Message Passing Interface. Improving the perform...
Collective communications occupy 20-90% of total execution times in many MPI applications. In this p...
Many parallel applications from scientific computing use MPI collective communication operations to ...
Many parallel applications from scientific computing use collective MPI communication oper-...
Collective communication allows efficient communication and synchronization among a collection of pr...
In order for collective communication routines to achieve high performance on different platforms, t...
Two-phase I/O is a well-known strategy for implementing collective MPI-IO functions. It redistribute...
Most parallel systems on which MPI is used are now hierarchical: some processors are much...