Training neural networks on large datasets can be accelerated by distributing the workload over a network of machines. As datasets grow ever larger, networks of hundreds or thousands of machines become economically viable. The time cost of communicating gradients limits the effectiveness of using such large machine counts, as may the increased chance of network faults. We explore a particularly simple algorithm for robust, communication-efficient learning---signSGD. Workers transmit only the sign of their gradient vector to a server, and the overall update is decided by a majority vote. This algorithm uses 32× less communication per iteration than full-precision, distributed SGD. Under natural conditions verified by experiment, we prove tha...
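The majority-vote scheme described in this abstract can be illustrated with a short sketch. The NumPy code below is a minimal illustration under stated assumptions, not the authors' implementation: the toy quadratic objective, the worker count, the noise scale, and the fixed step size are all arbitrary choices made to keep the example self-contained. It shows the two ideas the abstract names: each worker sends only the coordinate-wise sign of its stochastic gradient (1 bit per coordinate instead of a 32-bit float, hence the 32× communication saving), and the server aggregates by majority vote before broadcasting a signed update.

```python
import numpy as np

# Minimal sketch of signSGD with majority vote.
# Illustrative assumptions: toy quadratic objective, simulated workers,
# Gaussian gradient noise, fixed step size. Not the paper's implementation.

rng = np.random.default_rng(0)
dim, num_workers, lr, steps = 10, 7, 0.05, 200
x_star = rng.normal(size=dim)   # optimum of the toy objective (assumed)
x = np.zeros(dim)               # shared model parameters

def stochastic_gradient(x):
    """Gradient of 0.5 * ||x - x_star||^2 plus Gaussian noise (one worker's mini-batch)."""
    return (x - x_star) + rng.normal(scale=1.0, size=x.shape)

for step in range(steps):
    # Each worker transmits only the sign of its stochastic gradient (1 bit per coordinate).
    worker_signs = np.stack(
        [np.sign(stochastic_gradient(x)) for _ in range(num_workers)]
    )
    # The server takes a coordinate-wise majority vote and broadcasts the signed update.
    vote = np.sign(worker_signs.sum(axis=0))
    x -= lr * vote

print("final distance to optimum:", np.linalg.norm(x - x_star))
```

Because the server only needs the sign of the summed votes, a single corrupted or faulty worker can flip a coordinate's update only when the remaining workers are nearly evenly split, which is the intuition behind the robustness claim in the abstract.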
Mini-batch stochastic gradient descent (SGD) is state of the art in large scale distributed training...
Stochastic optimization algorithms implemented on distributed computing architectures are increasing...
In contrast to SGD, adaptive gradient methods like Adam allow robust training of modern deep network...
Training large neural networks requires distributing learning across multiple workers, where the cos...
For many data-intensive real-world applications, such as recognizing objects from images, detecting ...
Whether it occurs in artificial or biological substrates, learning is a distributed phenomen...
We present AGGREGATHOR, a framework that implements state-of-the-art robust (Byzantine-resilient) di...
In modern-day machine learning applications such as self-driving cars, recommender systems, robotics...
We consider distributed optimization under communication constraints for training deep learning mode...
When gradient descent (GD) is scaled to many parallel computing servers (workers) for large scale ma...
While machine learning is going through an era of celebrated success, concerns have been raised abou...
In distributed training of deep neural networks or Federated Learning (FL), practitioners typically run Stoch...
Sign-based algorithms (e.g. signSGD) have been proposed as a biased gradient compression technique t...
Load imbalance is pervasive in distributed deep learning training systems, caused either by th...
Many areas of deep learning benefit from using increasingly larger neural networks trained on public...