Deeper memory hierarchies, NUMA architectures, and network-style interconnects are widely used in modern many-core CPU designs to achieve performance scalability. Implementations of the Message Passing Interface (MPI), a leading intra-node programming model, must exploit these architectures to provide reliable performance portability. These new architectures require not only specialized MPI point-to-point messaging protocols but also carefully designed and tuned algorithms for MPI collective operations. Multiple issues must be taken into account: 1) minimizing the number of copies required, 2) minimizing traffic to "remote" NUMA memory, and 3) carefully avoiding memory bottlenecks for "rooted" collective operations. In this paper, we...
Non-blocking collectives have been proposed so as to allow communications to b...
This work presents and evaluates algorithms for MPI collective communication operations on high perf...
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014...
As the number of cores per node increases in modern clusters, intra-node commu...
The multiplication of cores in today's architectures raises the importance of ...
Multicore or many-core clusters have become the most prominent form of High Performance Computing (H...
The emergence of multicore processors raises the need to efficiently transfer ...
The increasing number of cores led to scalability issues in modern servers tha...
The increasing number of cores per processor is making multicore-based systems pervasive. This i...
To amortize the cost of MPI collective operations, non-blocking collectives ha...
The increasing number of cores per node in high-performance computing requires...
Multicore processors have not only reintroduced Non-Uniform Memory Access (NUM...
In the exascale computing era, applications are executed at larger scale than ever before, which results ...
Over the last decade, the Message Passing Interface (MPI) has become a very successful paralle...