Distributed training of Deep Neural Networks (DNNs) is an important technique for reducing the training time of large DNNs across a wide range of applications. In existing distributed training approaches, however, the communication time spent periodically exchanging parameters (i.e., weights) and gradients among compute nodes over the network constitutes a large fraction of the total training time. To reduce this communication time, we propose INCEPTIONN, an algorithm/hardware co-design. More specifically, observing that gradients are much more tolerant to precision loss than parameters, we first propose a gradient-centric distributed training algorithm. Because it is designed to exchange only gradients among nodes in a distributed manner, it can transfer less ...
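To make the idea above concrete, below is a minimal Python/NumPy sketch of a gradient-centric exchange in which nodes communicate only lossily quantized gradients. The uniform 8-bit quantizer, the function names (compress_gradient, exchange_gradients), and the averaging step are illustrative assumptions for this sketch, not INCEPTIONN's actual compression algorithm or hardware design.

```python
import numpy as np

def compress_gradient(grad, bits=8):
    """Lossily quantize a float32 gradient tensor to signed `bits`-bit integers.
    A simple uniform quantizer used purely for illustration; this is NOT the
    compression scheme proposed in the paper."""
    levels = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(grad)) + 1e-12        # per-tensor scale factor
    quantized = np.round(grad / scale * levels).astype(np.int8)
    return quantized, scale

def decompress_gradient(quantized, scale, bits=8):
    """Reconstruct an approximate float32 gradient from its quantized form."""
    levels = 2 ** (bits - 1) - 1
    return quantized.astype(np.float32) * scale / levels

def exchange_gradients(local_grads):
    """Gradient-centric exchange: each node sends only its compressed gradient,
    and every node applies the average of the decompressed gradients.
    `local_grads` stands in for the per-node gradients that would otherwise be
    sent over the network at full precision."""
    compressed = [compress_gradient(g) for g in local_grads]
    decompressed = [decompress_gradient(q, s) for q, s in compressed]
    return np.mean(decompressed, axis=0)

# Toy usage: four nodes holding gradients for the same parameter tensor.
grads = [np.random.randn(1000).astype(np.float32) for _ in range(4)]
averaged = exchange_gradients(grads)
```

The sketch captures the asymmetry the abstract exploits: parameters never cross the network, and only gradients, which tolerate precision loss, are compressed before being exchanged.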
With increasing data and model complexities, the time required to train neural networks has become p...
Compressed communication, in the form of sparsification or quantization of stochastic gradients, is ...
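As a concrete example of the gradient sparsification mentioned above, the following is a minimal top-k sketch; the 1% keep ratio and the helper names (top_k_sparsify, densify) are assumptions chosen for illustration, and the error-feedback mechanism that practical methods rely on is omitted.

```python
import numpy as np

def top_k_sparsify(grad, ratio=0.01):
    """Keep only the largest-magnitude fraction `ratio` of gradient entries.
    A toy version of gradient sparsification; real schemes usually add error
    feedback (accumulating the dropped residual), which is omitted here."""
    k = max(1, int(grad.size * ratio))
    idx = np.argpartition(np.abs(grad), -k)[-k:]   # indices of the k largest entries
    return idx, grad[idx]                          # only (index, value) pairs are communicated

def densify(idx, values, size):
    """Rebuild a dense gradient from the transmitted (index, value) pairs."""
    dense = np.zeros(size, dtype=np.float32)
    dense[idx] = values
    return dense

# Toy usage: transmit roughly 1% of a 10,000-element gradient.
g = np.random.randn(10_000).astype(np.float32)
idx, vals = top_k_sparsify(g)
g_hat = densify(idx, vals, g.size)
```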
The distributed training of deep learning models faces two issues: efficiency and privacy. First of ...
Training a deep neural network (DNN) with a single machine consumes much time. To accelerate the tra...
Accelerating and scaling the training of deep neural networks (DNNs) is critical to keep up with gro...
Highly distributed training of Deep Neural Networks (DNNs) on future compute platforms (offering 100...
In federated learning (FL), a global model is trained at a Parameter Server (PS) by aggregating mode...
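To illustrate the aggregation step described here, the sketch below shows a FedAvg-style weighted average of client updates at a parameter server; the function name and the use of local dataset sizes as weights are assumptions for illustration rather than the specific scheme of this work.

```python
import numpy as np

def aggregate_updates(client_updates, client_sizes):
    """FedAvg-style aggregation at the parameter server: average the clients'
    model updates, weighted by each client's local dataset size.
    `client_updates` and `client_sizes` are illustrative placeholders for what
    a real FL system would collect from participating clients."""
    weights = np.asarray(client_sizes, dtype=np.float64)
    weights = weights / weights.sum()
    return sum(w * u for w, u in zip(weights, client_updates))

# Toy usage: three clients with different amounts of local data.
updates = [np.random.randn(100) for _ in range(3)]
sizes = [500, 1500, 3000]
global_update = aggregate_updates(updates, sizes)
```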
Although the distributed machine learning methods can speed up the training of large deep neural net...
To support large-scale machine learning, distributed training is a promising approach as large-scale...
Data parallel training is commonly used for scaling distributed Deep Neural Network (DNN) training...
Deep Neural Networks (DNNs) enable computers to excel across many different applications such as ima...