Collective operations are common features of parallel programming models and are frequently used in high-performance computing (HPC) and machine/deep learning (ML/DL) applications. In strong scaling scenarios, collective operations can degrade overall application performance: as the core count increases, the load per rank decreases, while the time spent in collective operations increases logarithmically. In this article, we propose a design of eventually consistent collectives suitable for ML/DL computations by reducing communication in Broadcast and Reduce, as well as by exploring the Stale Synchronous Parallel (SSP) synchronization model for the Allreduce collective. Moreover, we also enrich the GASPI ecosystem with frequent...
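The strong scaling argument above can be illustrated with a toy cost model. This is a minimal sketch under stated assumptions, not the model used in the article: the compute time per step is taken to be the total work divided by the number of ranks, and the collective is assumed to cost a constant `alpha` per tree level, i.e. `alpha * log2(ranks)`. The names `step_time`, `collective_fraction`, `total_work`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math


def step_time(total_work, ranks, alpha):
    """Return (compute, collective) time for one step under the toy model.

    Assumption: compute time is total_work / ranks, and the collective
    costs alpha per level of a binary tree over the ranks.
    """
    compute = total_work / ranks  # load per rank shrinks as ranks grow
    collective = alpha * math.log2(ranks) if ranks > 1 else 0.0
    return compute, collective


def collective_fraction(total_work, ranks, alpha):
    """Fraction of the step time spent in the collective."""
    compute, collective = step_time(total_work, ranks, alpha)
    return collective / (compute + collective)


if __name__ == "__main__":
    # Fixed problem size (strong scaling), increasing rank count.
    for ranks in (2, 16, 128, 1024):
        fraction = collective_fraction(1000.0, ranks, 1.0)
        print(f"ranks={ranks:5d}  collective fraction={fraction:.3f}")
```

Under these assumptions the fraction of time spent in the collective grows with the rank count, which is the effect that motivates reducing communication in the collectives.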
Collective communication allows efficient communication and synchronization among a collection of pr...
Technology trends suggest that future machines will rely on parallelism to meet increasing performanc...
Large-scale iterative computations are common in many important data mining and machine learning alg...
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014...
Collective operations are commonly used in various parts of scientific applications. Especially in s...
Collective communications occupy 20-90% of total execution times in many MPI applications. In this p...
127 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2005. In this thesis, we motivate t...
The next generations of supercomputers are projected to have hundreds of thousands of processors. H...
Most parallel systems on which MPI is used are now hierarchical: some processors are much...
The parallel computing model used in this paper, the Collective Computing Model (CCM), is a variant ...
Technology trends suggest that future machines will rely on parallelism to meet increasing performan...
This work presents and evaluates algorithms for MPI collective communication operations on high perf...
Collective operations are among the most important communication operations in shared- and distribut...
Optimized collective operations are a crucial performance factor for many scientific applications. T...