Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used. We here investigate this phenomenon by revisiting the connection between random initialization in deep networks and spectral instabilities in products of random matrices. Given the rich literature on random matrices, it is not surprising to find that the rank of the intermediate representations in unnormalized networks collapses quickly with depth. In this work we highlight the fact that batch normalization is an effective strategy to avoid rank collapse for both linear and ReLU networks. Leveraging tools from Markov chain theory, w...
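To make the rank-collapse claim concrete, here is a minimal sketch (our own illustration, not code from the paper): a random mini-batch is pushed through a deep stack of randomly initialized linear layers, and the numerical rank of the hidden representation is tracked with and without a simple per-feature batch normalization. The width, batch size, depths, and 1/sqrt(width) scaling are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
width, batch = 32, 32

def batch_norm(h, eps=1e-5):
    # Normalize each feature to zero mean and unit variance over the batch.
    return (h - h.mean(axis=0)) / np.sqrt(h.var(axis=0) + eps)

for use_bn in (False, True):
    h = rng.standard_normal((batch, width))
    ranks = {}
    for depth in range(1, 1001):
        W = rng.standard_normal((width, width)) / np.sqrt(width)  # 1/sqrt(n) scaling
        h = h @ W
        if use_bn:
            h = batch_norm(h)
        if depth in (10, 100, 1000):
            ranks[depth] = int(np.linalg.matrix_rank(h))
    print(f"batch norm = {use_bn}: numerical rank by depth -> {ranks}")
```

Consistent with the abstract, the unnormalized ranks should decay as depth grows, while the batch-normalized representation stays close to full rank.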
In this thesis, we consider resource limitations on machine learning algorithms in a variety of sett...
Batch Normalization (BatchNorm) is a widely adopte...
Batch Normalization is an essential component of all state-of-the-art neural network architectures....
This paper underlines a subtle property of batch-normalization (BN): Successiv...
We propose a novel low-rank initialization framework for training low-rank deep neural networks -- n...
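The abstract above is truncated, so the exact scheme is not shown here; as one hedged illustration of what a low-rank initialization can look like, the sketch below stores a weight matrix as a rank-r product obtained from the truncated SVD of a conventional dense initialization. The sizes, the rank r, and the SVD-based construction are our assumptions, not necessarily the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 64, 64, 8
W_dense = rng.standard_normal((m, n)) / np.sqrt(m)       # conventional dense init
U_full, S, Vt = np.linalg.svd(W_dense, full_matrices=False)
U = U_full[:, :r] * np.sqrt(S[:r])                       # split the singular values
V = np.sqrt(S[:r])[:, None] * Vt[:r]                     # between the two factors
W_lowrank = U @ V                                        # rank-r weight matrix
print("rank:", np.linalg.matrix_rank(W_lowrank),
      "relative error:", np.linalg.norm(W_lowrank - W_dense) / np.linalg.norm(W_dense))
```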
Exciting new work on generalization bounds for neural networks (NN) given by Bartlett et al. (2017);...
Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks...
Deep Residual Networks (ResNets) have recently achieved state-of-the-art results on many challenging...
Optimization is the key component of deep learning. Increasing depth, which is vital for reaching a...
We analyze deep ReLU neural networks trained with mini-batch stochastic gradient descent and weight d...
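As a reference point for the training setup this abstract names, here is a minimal self-contained sketch (our construction, not the paper's code) of a deep ReLU network trained by mini-batch SGD with weight decay folded into the update; the data, architecture, and hyperparameters are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
dims = [16, 32, 32, 1]                  # input, two hidden ReLU layers, scalar output
Ws = [rng.standard_normal((a, b)) / np.sqrt(a) for a, b in zip(dims, dims[1:])]

X = rng.standard_normal((256, dims[0]))  # placeholder data
y = rng.standard_normal((256, 1))

def forward(x):
    acts = [x]
    for W in Ws[:-1]:
        acts.append(np.maximum(acts[-1] @ W, 0.0))       # hidden ReLU layers
    acts.append(acts[-1] @ Ws[-1])                       # linear output layer
    return acts

lr, wd, batch = 1e-2, 1e-4, 32
for step in range(100):
    idx = rng.choice(len(X), size=batch, replace=False)  # sample a mini-batch
    acts = forward(X[idx])
    grad = 2.0 * (acts[-1] - y[idx]) / batch             # d(squared loss)/d(output)
    for i in reversed(range(len(Ws))):
        gW = acts[i].T @ grad                            # gradient w.r.t. Ws[i]
        if i > 0:
            grad = (grad @ Ws[i].T) * (acts[i] > 0)      # backprop through ReLU
        Ws[i] -= lr * (gW + wd * Ws[i])                  # SGD step + weight decay
```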
Deep equilibrium networks (DEQs) are a promising way to construct models which trade off memory for ...
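For context on the memory/compute trade-off DEQs exploit, the sketch below illustrates the underlying idea under our own simplified assumptions: a single layer f is iterated to a fixed point z* = f(z*, x), so no stack of intermediate activations needs to be stored. The tanh layer, scaling, and tolerance are arbitrary illustrative choices, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
# Small spectral norm on W keeps f a contraction, so the iteration converges.
W = 0.2 * rng.standard_normal((d, d)) / np.sqrt(d)
U = rng.standard_normal((d, d)) / np.sqrt(d)

def f(z, x):
    # One "implicit layer": the same weights are reused at every iteration.
    return np.tanh(z @ W + x @ U)

x = rng.standard_normal((1, d))
z = np.zeros_like(x)
for _ in range(200):                     # naive fixed-point iteration
    z_next = f(z, x)
    if np.max(np.abs(z_next - z)) < 1e-10:
        break
    z = z_next
print("equilibrium residual:", np.max(np.abs(f(z, x) - z)))
```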
Modern deep neural networks are highly over-parameterized compared to the data on which they are tra...
Despite the widespread practical success of deep learning methods, our theoretical understanding of ...
The activation function deployed in a deep neural network has great influence on the performance of ...
Modern machine learning models, particularly those used in deep networks, are characterized by massi...