Distributed implementations are crucial in speeding up large-scale machine learning applications. Distributed gradient descent (GD) is widely employed to parallelize the learning task by distributing the dataset across multiple workers. A significant performance bottleneck for the per-iteration completion time in distributed synchronous GD is straggling workers. Coded distributed computation techniques have recently been introduced to mitigate stragglers and to speed up GD iterations by assigning redundant computations to workers. In this paper, we introduce a novel paradigm of dynamic coded computation, which assigns redundant data to workers, gaining the flexibility to dynamically choose from among a set of possible codes depending on t...
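The redundant-assignment idea above can be illustrated with a small simulation. The sketch below is not the paper's dynamic code construction; it assumes a simple cyclic replication of data partitions with a hypothetical replication factor r, so that the master can finish a GD iteration as soon as the fastest responding workers jointly cover every partition, tolerating up to r-1 stragglers per partition. All names and parameters are illustrative.

```python
# Minimal sketch: replication-based redundancy for straggler-tolerant distributed GD.
# Each data partition is held by r workers; the master uses whichever copy arrives first.
import numpy as np

rng = np.random.default_rng(0)
n_workers, n_parts, r = 6, 6, 2          # r = replication factor (tolerates r-1 stragglers per partition)
X = rng.normal(size=(600, 5))            # toy least-squares problem
w_true = rng.normal(size=5)
y = X @ w_true
parts = np.array_split(np.arange(len(X)), n_parts)

# Cyclic redundant assignment: worker i holds partitions i, i+1, ..., i+r-1 (mod n_parts).
assignment = {i: [(i + j) % n_parts for j in range(r)] for i in range(n_workers)}

def partial_grads(w, worker):
    """Per-partition gradients computed by one worker (squared loss, full partition)."""
    return {p: 2 * X[parts[p]].T @ (X[parts[p]] @ w - y[parts[p]]) / len(X)
            for p in assignment[worker]}

w = np.zeros(5)
for it in range(50):
    finish_time = rng.exponential(1.0, size=n_workers)   # simulated per-worker delays
    order = np.argsort(finish_time)                      # fastest workers respond first
    collected, used = {}, 0
    for worker in order:                                 # master stops once all partitions are covered
        used += 1
        for p, g in partial_grads(w, worker).items():
            collected.setdefault(p, g)                   # keep the first finished copy of each partition
        if len(collected) == n_parts:
            break
    full_grad = sum(collected.values())
    w -= 0.5 * full_grad
print("workers waited for in last iteration:", used, "| error:", np.linalg.norm(w - w_true))
```

Here the master simply takes the first finished copy of each partition; a dynamic scheme in the spirit of the abstract above would additionally adapt the redundancy or the code itself across iterations based on the observed straggling behavior.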
We study scheduling of computation tasks across n workers in a large scale distributed learning prob...
Synchronous SGD is frequently the algorithm of choice for training deep learning models on compute c...
As the size of models and datasets grows, it has become increasingly common to train models in paral...
In distributed synchronous gradient descent (GD) the main performance bottleneck for the per-iterati...
When gradient descent (GD) is scaled to many parallel computing servers (workers) for large scale ma...
In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC...
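For concreteness, one standard gradient-coding construction, the fractional repetition scheme, can be sketched in a few lines: with n workers and tolerance for any s stragglers (assuming s+1 divides n), each worker returns a single sum of the gradients of its s+1 assigned partitions, and the master recovers the exact full gradient from any n - s responses. The code below is a minimal, self-contained illustration of that generic scheme, not of the specific constructions proposed in the works listed here; variable names are illustrative.

```python
# Minimal sketch of the fractional repetition gradient code:
# partitions are grouped into disjoint blocks of size s+1, each block is replicated
# on s+1 workers, so any s stragglers leave at least one holder of every block.
import numpy as np

n, s = 6, 2                          # 6 workers, tolerate any s = 2 stragglers
assert n % (s + 1) == 0              # the scheme requires (s+1) | n
slots = n // (s + 1)                 # number of disjoint partition blocks ("slots")

# assignment[w] = partitions held by worker w: worker w sits in slot w % slots
# and holds that slot's s+1 consecutive partitions.
assignment = {w: list(range((w % slots) * (s + 1), (w % slots) * (s + 1) + s + 1))
              for w in range(n)}

# Toy per-partition gradients (in practice, gradients of the data partitions).
rng = np.random.default_rng(1)
part_grads = rng.normal(size=(n, 4))          # n partitions, gradient dimension 4
full_grad = part_grads.sum(axis=0)

def worker_message(w):
    """Each worker sends one vector: the sum over its assigned partitions."""
    return part_grads[assignment[w]].sum(axis=0)

def master_decode(responding):
    """Pick, for each slot, any responding worker in that slot; the picked workers'
    partition sets are disjoint and jointly cover all n partitions."""
    picked = {}
    for w in responding:
        picked.setdefault(w % slots, w)
    assert len(picked) == slots, "more than s stragglers: cannot decode"
    return sum(worker_message(w) for w in picked.values())

stragglers = {1, 4}                           # any s workers may straggle
responding = [w for w in range(n) if w not in stragglers]
recovered = master_decode(responding)
assert np.allclose(recovered, full_grad)
print("recovered full gradient from", len(responding), "of", n, "workers")
```

Because every block of partitions is replicated on s+1 workers, at least one copy of each block survives any s stragglers, which is exactly the property the decoder relies on.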
Gradient descent (GD) methods are commonly employed in machine learning problems to optimize the par...
When gradient descent (GD) is scaled to many parallel workers for large-scale machine learning appli...
Today's massively-sized datasets have made it necessary to often perform computations on them in a d...
Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we de...
Coded computation techniques provide robustness against straggling workers in distributed computing....
We consider the setting where a master wants to run a distributed stochastic gradient descent (SGD) ...
The current BigData era routinely requires the processing of large scale data on massive distributed...