Over the past decade, most supercomputer architectures have been based on clusters of SMP nodes. In these architectures, exchanges between processors go through shared memory when the processors are located on the same SMP node, and through the network otherwise. The MPI implementations provided by the vendors of such machines are generally adapted to this situation and take advantage of shared memory to handle messages between processors within the same SMP node. Nevertheless, this transparent approach to exploiting shared memory does not avoid the storage of the buffers needed for asynchronous communications. In parallel direct solvers, the storage of these buffers can become a bottleneck. In this paper, we propose ...
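To make the contrast concrete, here is a minimal sketch (illustrative, not the paper's actual solver code) of the hybrid MPI-thread organization the abstract alludes to: one MPI process per SMP node with several worker threads, so intra-node exchanges are plain shared-memory accesses that need no MPI buffers, while only inter-node data crosses MPI. All names are illustrative; since the threads themselves never call MPI here, the funneled thread level suffices.

/* Hybrid MPI + pthreads sketch: one MPI process per SMP node,
 * NTHREADS worker threads inside it. Intra-node "exchanges" are plain
 * writes to shared memory, so no communication buffers are allocated
 * for them; only inter-node data goes through MPI. */
#include <mpi.h>
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
static double shared_block[NTHREADS];  /* shared by all threads of the node */

static void *worker(void *arg)
{
    int tid = *(int *)arg;
    shared_block[tid] = (double)(tid + 1); /* intra-node exchange: a memory write */
    return NULL;
}

int main(int argc, char **argv)
{
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED)
        MPI_Abort(MPI_COMM_WORLD, 1);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    pthread_t th[NTHREADS];
    int ids[NTHREADS];
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        pthread_create(&th[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(th[i], NULL);

    /* Only one message per node crosses the network, not one per core. */
    double node_sum = 0.0, global_sum = 0.0;
    for (int i = 0; i < NTHREADS; i++)
        node_sum += shared_block[i];
    MPI_Allreduce(&node_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %g\n", global_sum);
    MPI_Finalize();
    return 0;
}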
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014...
MPI is a message-passing standard widely used for developing high-performance parallel applications....
A recent trend in high performance computing shows a rising number of cores per compute node, while ...
Abstract—With the increasing prominence of many-core architectures and decreasing per-core resource...
On using an hybrid MPI-Thread programming for the implementation of a parallel sparse direct solver...
As high-end computing systems continue to grow in scale, recent advances in multi- and many-core architectures...
With the end of Dennard scaling, future high performance computers are expected to consist of distributed...
Hybrid MPI+threads programming is gaining prominence as an alternative to the traditional "MPI everywhere"...
Abstract. To make the most effective use of parallel machines that are being built out of increasing...
Supercomputing applications rely on strong scaling to achieve faster results on a larger number of processors...
Many-core architectures, such as the Intel Xeon Phi, provide dozens of cores and hundreds of hardwar...
Communication overhead is one of the dominant factors affecting performance in high-end computing sy...
Our study proposes a novel MPI-only parallel programming model with improved performance for SMP clusters...
Most HPC systems are clusters of shared memory nodes. Parallel programming must combine the distribu...
This paper applies a Hybrid MPI-OpenMP programming model with a thread-to-thread communication method...
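As a hedged illustration of what a thread-to-thread communication method can look like in a hybrid MPI-OpenMP code (one common realization; the paper's exact scheme may differ), the sketch below has each OpenMP thread on rank 0 send to the same-numbered thread on rank 1, using the thread id as the MPI tag so messages are matched per thread rather than per process. It requires MPI_THREAD_MULTIPLE and exactly two ranks.

/* Thread-to-thread messaging sketch (illustrative): the MPI tag carries
 * the OpenMP thread id, so each thread pair gets its own logical channel.
 * Run with exactly 2 MPI ranks and an MPI supporting MPI_THREAD_MULTIPLE. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        MPI_Abort(MPI_COMM_WORLD, 1);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel num_threads(4)
    {
        int tid = omp_get_thread_num();
        if (rank == 0) {
            double msg = 100.0 + tid;
            /* tag = tid: matched by the same-numbered thread's receive on rank 1 */
            MPI_Send(&msg, 1, MPI_DOUBLE, 1, tid, MPI_COMM_WORLD);
        } else if (rank == 1) {
            double recv;
            MPI_Recv(&recv, 1, MPI_DOUBLE, 0, tid, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1, thread %d got %.0f\n", tid, recv);
        }
    }
    MPI_Finalize();
    return 0;
}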