We explore the multisend interface as a data mover interface to optimize applications with neighborhood collective communication operations. One of the limitations of the current MPI 2.1 standard is that the vector collective calls require counts and displacements (zero and non-zero bytes) to be specified for all the processors in the communicator. Further, all the collective calls in MPI 2.1 are blocking and do not permit overlap of communication with computation. We present the record replay persistent optimization to the multisend interface that minimizes the processor overhead of initiating the collective. We present four different case studies with the multisend API on Blue Gene/P: (i) 3D-FFT, (ii) 4D nearest neighbor exchange as used ...
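The cost of the vector-collective argument lists described above can be illustrated with a small sketch. It does not call MPI or the multisend API; it only builds the `counts` and `displs` arrays that an `MPI_Alltoallv`-style call would need on one rank. The function name `alltoallv_args`, the periodic 3D process grid, and the row-major rank ordering are assumptions made for illustration and are not taken from the paper.

```python
def alltoallv_args(rank, dims, bytes_per_neighbor):
    """Build per-rank counts/displs for a 3D nearest-neighbor exchange.

    Hypothetical helper: assumes a periodic px*py*pz process grid with
    row-major rank ordering. The arrays have one entry per rank in the
    communicator, even though only the six face neighbors receive data.
    """
    px, py, pz = dims
    nprocs = px * py * pz

    # Recover this rank's (x, y, z) coordinates from its row-major index.
    x, y, z = rank // (py * pz), (rank // pz) % py, rank % pz

    # One count per rank in the communicator; almost all of them stay zero.
    counts = [0] * nprocs
    for dx, dy, dz in ((1, 0, 0), (-1, 0, 0),
                       (0, 1, 0), (0, -1, 0),
                       (0, 0, 1), (0, 0, -1)):
        nx, ny, nz = (x + dx) % px, (y + dy) % py, (z + dz) % pz
        counts[(nx * py + ny) * pz + nz] = bytes_per_neighbor

    # Displacements are the running byte offsets into a packed send buffer.
    displs = []
    offset = 0
    for count in counts:
        displs.append(offset)
        offset += count
    return counts, displs


if __name__ == "__main__":
    counts, displs = alltoallv_args(rank=0, dims=(8, 8, 8), bytes_per_neighbor=1024)
    nonzero = sum(1 for c in counts if c > 0)
    print(len(counts), nonzero)
```

On an 8 x 8 x 8 grid each rank fills 512 count entries and 512 displacement entries, of which only 6 counts are non-zero. That per-call setup work grows with the communicator size rather than with the number of neighbors, which is the overhead the abstract attributes to the MPI 2.1 vector collectives.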
NAMD is a scalable molecular dynamics application, which has proven its performance on several paral...
We have investigated the performance characteristics of hardware transactional memory (HTM)...
Non-blocking collectives have been proposed so as to allow communications to b...
The performance of several spike exchange methods using a Blue Gene/P supercomputer has been tested w...
Collective communications occupy 20-90% of total execution times in many MPI applications. In this p...
Nanoscale communication will expand the scope of nanotechnology and bring new applications to the fu...
We discuss issues in designing sparse (nearest neighbor) collective operations for communic...
To amortize the cost of MPI collective operations, nonblocking collectives hav...
To amortize the cost of MPI collective operations, non-blocking collectives ha...
This talk discusses optimized collective algorithms and the benefits of leveraging independent hardw...
Technology trends suggest that future machines will rely on parallelism to meet increasing performan...
The current trends in high performance computing show that large machines with tens of thousands of ...
We consider the problem of communication avoidance in computing interactions between a set of partic...
Technology trends suggest that future machines will rely on parallelism to meet increasing performanc...
This work presents and evaluates algorithms for MPI collective communication operations on high perf...