For many distributed applications, data communication is a major bottleneck in terms of both performance and energy consumption. As more cores are integrated per node, the aggregate performance of the system generally increases, yet it eventually becomes limited by the interconnection network. This is the case for distributed data-parallel training of convolutional neural networks (CNNs), which usually proceeds on a cluster with a small to moderate number of nodes. In this paper, we analyze the performance of the Allreduce collective communication primitive, which is key to efficient data-parallel distributed training of CNNs. Our study targets the distinct realizations of this primitive in three high-performance instances ...
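Concretely, in data-parallel training each worker computes gradients on its own mini-batch shard, and the per-worker gradients are then combined with an Allreduce so every worker applies the same update. A minimal sketch of that step, assuming mpi4py and NumPy are available (the gradient buffer below is a random stand-in, not part of the paper's experimental setup):

    # Allreduce of local gradients across all MPI ranks (sketch).
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD

    # Stand-in for the gradients produced by one worker's backward pass.
    local_grads = np.random.rand(1 << 20).astype(np.float32)

    # Sum the gradient vectors of all ranks; every rank receives the result.
    global_grads = np.empty_like(local_grads)
    comm.Allreduce(local_grads, global_grads, op=MPI.SUM)

    # Average so the update matches SGD on the combined mini-batch.
    global_grads /= comm.Get_size()

Launched as one process per worker, e.g. mpirun -np 4 python allreduce_sketch.py; the MPI library's internal Allreduce algorithm is exactly what the paper's analysis compares across implementations.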
Distributed deep learning becomes very common to reduce the overall training time by exploiting mult...
Thesis (Master's)--University of Washington, 2018. The recent success of Deep Neural Networks (DNNs) [...
Accelerating and scaling the training of deep neural networks (DNNs) is critical to keep up with gro...
TensorFlow (TF) is usually combined with the Horovod (HVD) workload distribution package to obt... (a minimal Horovod sketch is given after this list)
Convolutional Neural Networks (CNNs) have been shown to be powerful classification tools in tasks that ra...
MPI Learn is a framework for distributed training of Neural Networks. Machine Learning models can ta...
The field of deep learning has been the focus of plenty of research and development over the last y...
Deep learning algorithms base their success on building high learning capacity models with millions ...
Deep Neural Networks (DNNs) enable computers to excel across many different applications such as ima...
Deep Neural Network (DNN) frameworks use distributed training to enable faster time to convergence a...
The rapid growth of data and ever increasing model complexity of deep neural networks (DNNs) have en...
With increasing data and model complexities, the time required to train neural networks has become p...
Convolutional neural networks (CNNs) are important in a wide variety of machine learning tasks and a...
Deep learning models' prediction accuracy tends to improve with the size of the model. The implicati...
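For reference, the TensorFlow-plus-Horovod combination mentioned in the list above typically takes the following shape. This is a hedged sketch assuming horovod.tensorflow.keras is installed; the model, dataset, and hyperparameters are placeholders, not taken from any of the papers listed:

    # Sketch of TF + Horovod data-parallel training (placeholder model/data).
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd

    hvd.init()  # one process per worker, launched e.g. via horovodrun

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,)),
    ])

    # Scale the learning rate with the worker count and wrap the optimizer
    # so per-worker gradients are combined with Allreduce at each step.
    opt = tf.keras.optimizers.SGD(0.01 * hvd.size())
    opt = hvd.DistributedOptimizer(opt)

    model.compile(loss="sparse_categorical_crossentropy", optimizer=opt)

    # Broadcast the initial weights from rank 0 so all workers start
    # from the same state.
    callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]

    (x, y), _ = tf.keras.datasets.mnist.load_data()
    x = x.reshape(-1, 784).astype("float32") / 255.0
    model.fit(x, y, batch_size=64, epochs=1, callbacks=callbacks,
              verbose=1 if hvd.rank() == 0 else 0)

Run with, e.g., horovodrun -np 4 python train.py; under the hood the DistributedOptimizer issues exactly the Allreduce operations whose performance the main paper studies.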