We have implemented eight of the MPI collective routines using MPI point-to-point communication routines, with algorithms designed to be efficient for large messages. The performance of our implementations is compared with that of the vendor implementations on the Cray T3E-600, the Cray Origin 2000 and the IBM SP. Many of our implementations significantly outperformed the vendor implementations on the T3E and the Origin 2000. On the SP, only our implementation of the broadcast significantly outperformed IBM's implementation.
We evaluate the architectural support of collective communication operations on the IBM SP2, Cray T3...
Many parallel applications from scientific computing use MPI collective communication operations to ...
The primary purpose of this technical report was to evaluate the performance of the MPI-2 one-sided...
Collective communication is an important subset of the Message Passing Interface. Improving the perform...
The Message Passing Interface standard, released in April 1994 by the MPI Forum [2], defines a set of...
In order for collective communication routines to achieve high performance on different platforms, t...
We give an overview of the algorithms and implementations in the high-performance MPI libraries MPI/...
Previous studies of application usage show that the performance of collective communications is cr...
T3E-900, the Cray Origin 2000 and the IBM P2SC on a collection of 13 communication tests. These test...
In this paper the parallel benchmark code PSTSWM is used to evaluate the performance of the vendor-s...
The performance of collective communication is critical to the overall system performance. In gene...
The performance of collective communication operations is one of the deciding factors in the overa...
We discuss the design and high-performance implementation of collective communications operations on...