The dragonfly network topology has recently gained traction in the design of high performance computing (HPC) systems and has been implemented in large-scale supercomputers. The impact of task mapping, i.e., placement of MPI ranks onto compute cores, on the communication performance of applications on dragonfly networks has not been comprehensively investigated on real large-scale systems. This paper demonstrates that task mapping affects the communication overhead significantly in dragonflies and the magnitude of this effect is sensitive to the application, job size, and the OpenMP settings. Among the three task mapping algorithms we study (in-order, random, and recursive coordinate bisection), selecting a suitable task mapper reduces appl...
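The three task mappers named above (in-order, random, and recursive coordinate bisection) can be sketched compactly. The following is a hypothetical illustration, not the paper's implementation: task and core identifiers, the `coord` field, and the assumption that core counts are powers of two are all simplifications introduced here.

```python
import random

def inorder_map(n_tasks, cores):
    # In-order mapping: rank i goes to the i-th core in the given order.
    return {rank: cores[rank] for rank in range(n_tasks)}

def random_map(n_tasks, cores, seed=0):
    # Random mapping: a seeded shuffle of the cores, then in-order assignment.
    rng = random.Random(seed)
    shuffled = list(cores[:n_tasks])
    rng.shuffle(shuffled)
    return {rank: shuffled[rank] for rank in range(n_tasks)}

def rcb_map(tasks, cores):
    # Recursive coordinate bisection: split the tasks along their widest
    # coordinate dimension and recurse, giving each half of the tasks
    # half of the cores. `tasks` is a list of {"rank": int, "coord": tuple}.
    if len(cores) == 1:
        return {t["rank"]: cores[0] for t in tasks}
    ndims = len(tasks[0]["coord"])
    dim = max(range(ndims),
              key=lambda d: max(t["coord"][d] for t in tasks)
                          - min(t["coord"][d] for t in tasks))
    ordered = sorted(tasks, key=lambda t: t["coord"][dim])
    mid = len(ordered) // 2
    mapping = rcb_map(ordered[:mid], cores[:len(cores) // 2])
    mapping.update(rcb_map(ordered[mid:], cores[len(cores) // 2:]))
    return mapping
```

For example, four tasks laid out on a 2x2 grid map to four cores so that tasks close in coordinate space land on adjacent cores, which is the locality property RCB trades against the uniform spreading of the random mapper.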
High-radix direct network topologies such as Dragonfly have been proposed for ...
Large-scale compute clusters are highly affected by performance variability that originates from dif...
Abstract—We present a new method for mapping applications' MPI tasks to cores of a parallel comput...
The dragonfly topology is becoming a popular choice for building high-radix, low-diameter networks w...
Considering the large number of processors and the size of the interconnection networks on exascale ...
Communication is a necessary but overhead inducing component of parallel programming. Its impact on ...
Petascale machines with hundreds of thousands of cores are being built. These machines have varying ...
Abhinav Bhatele, Ph.D. student at the Parallel Programming Lab at the University of Illinois present...
Network contention has an increasingly adverse effect on the performance of parallel applications wi...
Large-scale multiprocessor computers have numerous communicating components, and therefore place gre...
Networks are the backbone of modern HPC systems. They serve as a critical piece of infrastructure, t...
The overall efficiency of an extreme-scale supercomputer largely relies on the performance of its ne...
Abstract—Interconnection networks are a critical resource for large supercomputers. The dragonfly to...
In the early years of parallel computing research, significant theoretical studies were done on inte...
Dragonflies are one of the most promising topologies for the Exascale effort for their scalability a...