Collective communication allows efficient communication and synchronization among a collection of processes, unlike point-to-point communication, which involves only a pair of communicating processes. Achieving high performance for both kernels and full-scale applications running on a distributed memory system requires an efficient implementation of collective communication operations. Developing an efficient implementation requires attention to both algorithmic and hardware issues. This dissertation proposes and describes the implementation of collective communication algorithms that are both novel and extremely efficient. These algorithms target distributed memory machines: both clusters (with nodes that are either SMPs or uniprocessors) an...
In this paper, we outline a unified approach for building a library of collective communication oper...
To amortize the cost of MPI collective operations, nonblocking collectives hav...
The significance of collective communication operations for scalable parallel systems has been wel...
This work presents and evaluates algorithms for MPI collective communication operations on high perf...
We discuss the design and high-performance implementation of collective communications operations on...
This paper describes a novel methodology for implementing a common set of collective communication o...
127 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2005. In this thesis, we motivate t...
We report on a project to develop a unified approach for building a library of collective communicat...
Technology trends suggest that future machines will rely on parallelism to meet increasing performan...
Networks of Workstations (NOW) have become an attractive alternative platform for high performance c...
Technology trends suggest that future machines will rely on parallelism to meet increasing performanc...
The increasing number of cores per processor is making multicore-based systems pervasive. This i...
We evaluate the architectural support of collective communication operations on the IBM SP2, Cray T3...
Many parallel applications from scientific computing use collective MPI communication oper-...
Collective operations are among the most important communication operations in shared- and distribut...