In the context of Deep Learning training, the memory needed to store activations can prevent the user from considering large models and large batch sizes. A possible solution is to rely on model parallelism to distribute the weights of the model and the activations over distributed-memory nodes. In this paper, we consider another, purely sequential approach to saving memory using checkpointing techniques. Checkpointing techniques were introduced in the context of Automatic Differentiation. They consist of storing some, but not all, activations during the feed-forward phase of network training, and then recomputing the missing values during the backward phase. Using this approach, it is possible, at the price of ...
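The abstract above describes the core idea of activation checkpointing: keep only a subset of activations during the forward pass and recompute the rest on the fly during the backward pass. As a minimal, illustrative sketch of that trade-off (not the paper's specific algorithm), the snippet below uses PyTorch's torch.utils.checkpoint.checkpoint_sequential on a simple sequential chain; the layer sizes and the choice of 4 segments are arbitrary values picked for the example.

```python
# Illustrative sketch of activation checkpointing (rematerialization) using
# PyTorch's torch.utils.checkpoint; it shows the general store-some /
# recompute-the-rest trade-off, not the strategy proposed in the paper.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A deep feed-forward chain whose intermediate activations would normally all
# be kept alive until the backward phase.
model = nn.Sequential(
    *[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(16)]
)

# requires_grad on the input lets gradients flow through the recomputed segments.
x = torch.randn(64, 1024, requires_grad=True)

# Split the chain into 4 segments: only the segment boundaries are stored during
# the forward pass; activations inside each segment are recomputed during the
# backward pass, trading extra compute for lower peak memory.
# (use_reentrant=False selects the non-reentrant variant in recent PyTorch versions.)
out = checkpoint_sequential(model, 4, x, use_reentrant=False)
loss = out.sum()
loss.backward()
```

Here only the inputs of the 4 segments are retained; each segment's internal activations are recomputed once during the backward pass, which is exactly the compute-for-memory exchange the abstract refers to.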
Deep learning methods have recently started dominating the machine learning world as they offer stat...
We propose a new integrated method of exploiting model, batch and domain parallelism for the trainin...
Residual connections are ubiquitous in deep learning, since besides residual networks and their vari...
In the context of Deep Learning training, the memory needed to store activations can prevent ...
This paper introduces a new activation checkpointing method which makes it possible to significantly decrease m...
Deep Learning training memory needs can prevent the user from considering large mode...
Artificial Intelligence is a field that has received a lot of attention recently. Its success is due...
The training phase in Deep Neural Networks has become an important source of computing resource usag...
Rematerialization and offloading are two well-known strategies to save memory ...
Training Deep Neural Networks is known to be an expensive operation, both in t...
We are interested in the problem of continual learning of artificial neural networks in the case whe...
Scientific workflows are frequently modeled as Directed Acyclic Graphs (DAGs) of tasks, which repres...
With the emergence of versatile storage systems, multi-level checkpointing (ML...
To make machine learning (ML) sustainable and apt to run on the diverse devices where relevant data...
The compilation of high-level programming languages for parallel machines faces two challenges: maxi...