Abstract. Most parallel systems on which MPI is used are now hierarchical: some processors are much closer to others in terms of interconnect performance. One of the most common examples is a system whose nodes are symmetric multiprocessors (including “multicore” processors). A number of papers have developed algorithms and implementations that exploit shared memory on such nodes to provide optimized collective operations, and these show significant performance benefits compared to implementations that do not exploit the hierarchical structure of the nodes. However, shared memory between processes is often a scarce resource. How necessary is it to use shared memory for collectives in MPI? How much of the performance benefit comes f...
Multicore or many-core clusters have become the most prominent form of High Performance Computing (H...
Many parallel applications from scientific computing use collective MPI communication oper-...
Collective communication is an important subset of the Message Passing Interface. Improving the perform...
This work presents and evaluates algorithms for MPI collective communication operations on high perf...
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014...
The efficient implementation of collective communication operations has received much attention. Ini-...
Collective communications occupy 20-90% of total execution times in many MPI applications. In this p...
Collective operations are common features of parallel programming models that are frequently used in...
To amortize the cost of MPI collective operations, non-blocking collectives ha...
Many parallel applications from scientific computing use MPI collective communication operations to ...
We discuss the design and high-performance implementation of collective communications operations on...
To amortize the cost of MPI collective operations, nonblocking collectives hav...
This talk discusses optimized collective algorithms and the benefits of leveraging independent hardw...
We describe how 2-level memory hierarchies can be exploited to optimize the i...
This paper describes a novel methodology for implementing a common set of collective communication o...