The increasing number of cores per node in high-performance computing requires an efficient intra-node MPI communication subsystem. Most existing MPI implementations rely on two copies across a shared memory-mapped file. Open-MX offers a single-copy mechanism that is tightly integrated into its regular communication stack, making it transparently available to the MX backend of many MPI layers. We describe this implementation and its offloaded copy backend using I/OAT hardware. Memory pinning requirements are then discussed, and overlapped pinning is introduced so that Open-MX intra-node data transfers can start earlier. Performance evaluation shows that this local communication stack performs better than MPICH2 and O...
Clusters of several thousand nodes interconnected with InfiniBand, an emerging high-performance inte...
This work presents and evaluates algorithms for MPI collective communication operations on high perf...
A recent trend in high performance computing shows a rising number of cores per compute node, while ...
As the number of cores per node increases in modern clusters, intra-node commu...
Open-MX is a new message passing layer implemented on top of the generic Ether...
The emergence of multicore processors raises the need to efficiently transfer ...
In the last decade, cluster computing has become the most popular high-perform...
High-performance cluster networks achieve very high throughput thanks to zero-...
The multiplication of cores in today's architectures raises the importance of ...
High-speed networking in clusters usually relies on advanced hardware features...
More memory hierarchies, NUMA architectures and network-style interconnection are widely used in mod...
Open-MX is a new message passing layer implemented on top of the generic Ether...
In the exascale computing era, applications are executed at larger scale than ever before, which results ...
Running parallel applications on clusters with high-speed local networks requi...
Modern processors have multiple cores on a chip to overcome power consumption and heat di...