Randomly initialized neural networks are known to become harder to train with increasing depth, unless architectural enhancements like residual connections and batch normalization are used. We here investigate this phenomenon by revisiting the connection between random initialization in deep networks and spectral instabilities in products of random matrices. Given the rich literature on random matrices, it is not surprising to find that the rank of the intermediate representations in unnormalized networks collapses quickly with depth. In this work we highlight the fact that batch normalization is an effective strategy to avoid rank collapse for both linear and ReLU networks. Leveraging tools from Markov chain theory, w...
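To make the rank-collapse claim concrete, here is a minimal sketch (our own illustration, not code from the paper): a random mini-batch is pushed through a deep stack of randomly initialized linear layers, and the numerical rank of the hidden representation is tracked with and without a simple per-feature batch normalization. The width, batch size, depths, and 1/sqrt(width) scaling are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
width, batch = 32, 32

def batch_norm(h, eps=1e-5):
    # Normalize each feature to zero mean and unit variance over the batch.
    return (h - h.mean(axis=0)) / np.sqrt(h.var(axis=0) + eps)

for use_bn in (False, True):
    h = rng.standard_normal((batch, width))
    ranks = {}
    for depth in range(1, 1001):
        W = rng.standard_normal((width, width)) / np.sqrt(width)  # 1/sqrt(n) scaling
        h = h @ W
        if use_bn:
            h = batch_norm(h)
        if depth in (10, 100, 1000):
            ranks[depth] = int(np.linalg.matrix_rank(h))
    print(f"batch norm = {use_bn}: numerical rank by depth -> {ranks}")
```

Consistent with the abstract, the unnormalized ranks should decay as depth grows, while the batch-normalized representation stays close to full rank.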
In this thesis, we consider resource limitations on machine learning algorithms in a variety of sett...
Batch Normalization (BatchNorm) is a widely adopte...
Batch Normalization is an essential component of all state-of-the-art neural network architectures....
This paper underlines a subtle property of batch-normalization (BN): Successiv...
We propose a novel low-rank initialization framework for training low-rank deep neural networks -- n...
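The abstract above is truncated, so the exact scheme is not shown here; as one hedged illustration of what a low-rank initialization can look like, the sketch below stores a weight matrix as a rank-r product obtained from the truncated SVD of a conventional dense initialization. The sizes, the rank r, and the SVD-based construction are our assumptions, not necessarily the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 64, 64, 8
W_dense = rng.standard_normal((m, n)) / np.sqrt(m)       # conventional dense init
U_full, S, Vt = np.linalg.svd(W_dense, full_matrices=False)
U = U_full[:, :r] * np.sqrt(S[:r])                       # split the singular values
V = np.sqrt(S[:r])[:, None] * Vt[:r]                     # between the two factors
W_lowrank = U @ V                                        # rank-r weight matrix
print("rank:", np.linalg.matrix_rank(W_lowrank),
      "relative error:", np.linalg.norm(W_lowrank - W_dense) / np.linalg.norm(W_dense))
```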
Exciting new work on generalization bounds for neural networks (NN) given by Bartlett et al. (2017);...
Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks...
Deep Residual Networks (ResNets) have recently achieved state-of-the-art results on many challenging...
Optimization is the key component of deep learning. Increasing depth, which is vital for reaching a...
We analyze deep ReLU neural networks trained with mini-batch stochastic gradient descent and weight d...
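As a reference point for the training setup this abstract names, here is a minimal self-contained sketch (our construction, not the paper's code) of a deep ReLU network trained by mini-batch SGD with weight decay folded into the update; the data, architecture, and hyperparameters are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
dims = [16, 32, 32, 1]                  # input, two hidden ReLU layers, scalar output
Ws = [rng.standard_normal((a, b)) / np.sqrt(a) for a, b in zip(dims, dims[1:])]

X = rng.standard_normal((256, dims[0]))  # placeholder data
y = rng.standard_normal((256, 1))

def forward(x):
    acts = [x]
    for W in Ws[:-1]:
        acts.append(np.maximum(acts[-1] @ W, 0.0))       # hidden ReLU layers
    acts.append(acts[-1] @ Ws[-1])                       # linear output layer
    return acts

lr, wd, batch = 1e-2, 1e-4, 32
for step in range(100):
    idx = rng.choice(len(X), size=batch, replace=False)  # sample a mini-batch
    acts = forward(X[idx])
    grad = 2.0 * (acts[-1] - y[idx]) / batch             # d(squared loss)/d(output)
    for i in reversed(range(len(Ws))):
        gW = acts[i].T @ grad                            # gradient w.r.t. Ws[i]
        if i > 0:
            grad = (grad @ Ws[i].T) * (acts[i] > 0)      # backprop through ReLU
        Ws[i] -= lr * (gW + wd * Ws[i])                  # SGD step + weight decay
```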
Deep equilibrium networks (DEQs) are a promising way to construct models which trade off memory for ...
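For context on the memory/compute trade-off DEQs exploit, the sketch below illustrates the underlying idea under our own simplified assumptions: a single layer f is iterated to a fixed point z* = f(z*, x), so no stack of intermediate activations needs to be stored. The tanh layer, scaling, and tolerance are arbitrary illustrative choices, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
# Small spectral norm on W keeps f a contraction, so the iteration converges.
W = 0.2 * rng.standard_normal((d, d)) / np.sqrt(d)
U = rng.standard_normal((d, d)) / np.sqrt(d)

def f(z, x):
    # One "implicit layer": the same weights are reused at every iteration.
    return np.tanh(z @ W + x @ U)

x = rng.standard_normal((1, d))
z = np.zeros_like(x)
for _ in range(200):                     # naive fixed-point iteration
    z_next = f(z, x)
    if np.max(np.abs(z_next - z)) < 1e-10:
        break
    z = z_next
print("equilibrium residual:", np.max(np.abs(f(z, x) - z)))
```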
Modern deep neural networks are highly over-parameterized compared to the data on which they are tra...
Despite the widespread practical success of deep learning methods, our theoretical understanding of ...
The activation function deployed in a deep neural network has great influence on the performance of ...
Modern machine learning models, particularly those used in deep networks, are characterized by massi...