Abstract. CFL (Communication Fusion Library) is an experimental C++ library which supports shared reduction variables in MPI programs. It uses overloading to distinguish private variables from replicated, shared variables, and automatically introduces MPI communication to keep replicated data consistent. This paper concerns a simple but surprisingly effective technique which improves performance substantially: CFL operators are executed lazily in order to expose opportunities for run-time, context-dependent optimisation such as message aggregation and operator fusion. We evaluate the idea using both toy benchmarks and a 'production' code for simulating plankton population dynamics in the upper ocean. The results demonstrate the softw...
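The core idea is easiest to see in code. The sketch below is a minimal illustration under stated assumptions, not CFL's actual API: a hypothetical LazySum wrapper overloads += to record local contributions without communicating, and only forces a single MPI_Allreduce when the value is read, so several pending updates are fused into one collective.

#include <mpi.h>
#include <cstdio>

// Hypothetical CFL-like wrapper (illustrative; not CFL's real interface).
// Local contributions accumulate privately; the global reduction is deferred
// until the value is actually needed, exposing fusion opportunities.
class LazySum {
    double local_ = 0.0;   // this rank's private contribution
    double global_ = 0.0;  // cached result of the last reduction
    bool dirty_ = false;   // true while a reduction is still pending
public:
    // Overloaded += records the update without any MPI traffic.
    LazySum& operator+=(double v) { local_ += v; dirty_ = true; return *this; }

    // Reading the value forces the deferred, fused MPI_Allreduce.
    double value() {
        if (dirty_) {
            MPI_Allreduce(&local_, &global_, 1, MPI_DOUBLE,
                          MPI_SUM, MPI_COMM_WORLD);
            dirty_ = false;
        }
        return global_;
    }
};

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    LazySum s;
    s += rank;                 // no communication here
    s += 1.0;                  // still none: both updates stay local
    double total = s.value();  // one MPI_Allreduce covers both updates

    if (rank == 0) std::printf("total = %f\n", total);
    MPI_Finalize();
}

Deferring the reduction until the read is what lets the two updates above collapse into a single collective; per the abstract, CFL applies the same laziness across variables and operators to aggregate messages at run time.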
This paper presents an optimization of MPI communications, called CoMPI, based on run-time compressi...
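To make the compression idea concrete, here is a generic compress-before-send sketch, not CoMPI's implementation, using zlib's compress()/uncompress(); the buffer size, tags, and ranks are arbitrary choices for illustration, and the length is assumed to fit in an unsigned long.

#include <mpi.h>
#include <zlib.h>
#include <vector>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int N = 1 << 20;                 // 1M doubles, highly compressible
    std::vector<double> data(N, 0.0);

    if (rank == 0) {
        uLongf bound = compressBound(N * sizeof(double));
        std::vector<Bytef> packed(bound);
        uLongf packedLen = bound;
        compress(packed.data(), &packedLen,
                 reinterpret_cast<const Bytef*>(data.data()),
                 N * sizeof(double));
        // Send the compressed length first, then the compressed payload.
        MPI_Send(&packedLen, 1, MPI_UNSIGNED_LONG, 1, 0, MPI_COMM_WORLD);
        MPI_Send(packed.data(), (int)packedLen, MPI_BYTE, 1, 1, MPI_COMM_WORLD);
    } else if (rank == 1) {
        uLongf packedLen;
        MPI_Recv(&packedLen, 1, MPI_UNSIGNED_LONG, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        std::vector<Bytef> packed(packedLen);
        MPI_Recv(packed.data(), (int)packedLen, MPI_BYTE, 0, 1, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        uLongf rawLen = N * sizeof(double);
        uncompress(reinterpret_cast<Bytef*>(data.data()), &rawLen,
                   packed.data(), packedLen);
        std::printf("received %lu bytes after decompression\n", rawLen);
    }
    MPI_Finalize();
}

A run-time scheme like CoMPI would make this trade-off transparently inside the library, compressing only when the bandwidth saved outweighs the compression cost.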
We examine the mechanics of the send and receive mechanism of MPI and in particular how we can impl...
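For illustration, a minimal sketch of the posting/completion split at the heart of MPI's send and receive mechanics, here using the standard nonblocking MPI_Isend/MPI_Irecv pair; the payload and tag are arbitrary.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, value = -1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Request req;
    if (rank == 0) {
        value = 42;
        MPI_Isend(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        // ... independent computation could overlap the transfer here ...
        MPI_Wait(&req, MPI_STATUS_IGNORE);  // send buffer reusable from here
    } else if (rank == 1) {
        MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);  // message has now arrived
        std::printf("rank 1 received %d\n", value);
    }
    MPI_Finalize();
}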
MPI is widely used for programming large HPC clusters. MPI also includes persistent operations, whic...
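The persistent operations mentioned here amortise the setup cost of a repeated communication. The sketch below shows the standard usage pattern (not tied to any particular paper's optimisation): the send and receive are initialised once, then restarted cheaply each iteration.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double buf[4] = {0, 1, 2, 3};
    MPI_Request req;

    if (rank == 0) {
        // Set up the persistent send once; its arguments are fixed for life.
        MPI_Send_init(buf, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
        for (int iter = 0; iter < 10; ++iter) {
            buf[0] = iter;             // payload may change between restarts
            MPI_Start(&req);           // re-post the same send cheaply
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }
        MPI_Request_free(&req);
    } else if (rank == 1) {
        MPI_Recv_init(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
        for (int iter = 0; iter < 10; ++iter) {
            MPI_Start(&req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }
        MPI_Request_free(&req);
        std::printf("last value: %f\n", buf[0]);
    }
    MPI_Finalize();
}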
MPI-based explicitly parallel programs have been widely used for developing high-performance applicat...
Abstract. The MPI datatype functionality provides a powerful tool for describing structured memory a...
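As a concrete illustration of that functionality, the sketch below uses MPI_Type_vector to describe one column of a row-major matrix so it can be sent in place, without manual packing; the matrix shape and ranks are arbitrary.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int ROWS = 4, COLS = 5;
    double m[ROWS][COLS];

    // ROWS blocks of 1 element each, separated by a stride of COLS elements:
    // this picks out a single column of the row-major matrix.
    MPI_Datatype column;
    MPI_Type_vector(ROWS, 1, COLS, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    if (rank == 0) {
        for (int i = 0; i < ROWS; ++i)
            for (int j = 0; j < COLS; ++j)
                m[i][j] = i * COLS + j;
        // Send column 2 directly; MPI gathers the strided elements itself.
        MPI_Send(&m[0][2], 1, column, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double col[ROWS];
        MPI_Recv(col, ROWS, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        for (int i = 0; i < ROWS; ++i) std::printf("%f\n", col[i]);
    }

    MPI_Type_free(&column);
    MPI_Finalize();
}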
This study focuses on gaining insights into the usage of the Message-Passing Interface (MPI) in a la...
Abstract — This paper reports on our experiences in parallelizing WaterGAP, an originally sequential...
Abstract. Distributed memory architectures such as Linux clusters have become increasingly common but...
Programmers' productivity has always been overlooked compared to the performance optimizations in ...
Abstract—MPI is the de facto standard for portable parallel programming on high-end sy...
The availability of cheap computers with outstanding single-processor performance coupled with Ether...
MPI is a message-passing standard widely used for developing high-performance parallel applications....
In this report we describe how to reduce the communication time of MPI parallel applications with the u...