The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting, that is, accurate predictions despite overfitting training data. In this article, we survey recent progress...
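To make the implicit-regularization principle concrete, here is a minimal NumPy sketch (our illustration on synthetic data, not code from the survey): on an overparameterized least-squares problem, plain gradient descent started from zero converges to the minimum-l2-norm interpolating solution, even though no penalty term appears anywhere in the objective.

```python
# Illustration (not from the survey): implicit regularization of gradient
# descent on overparameterized least squares. With d > n there are
# infinitely many interpolating solutions; GD initialized at zero stays in
# the row space of X and converges to the minimum-l2-norm interpolant.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                        # more parameters than samples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)                       # zero initialization matters here
lr = 1e-2
for _ in range(50_000):
    w -= lr * X.T @ (X @ w - y) / n   # plain gradient descent, no penalty

w_min_norm = np.linalg.pinv(X) @ y    # minimum-norm interpolant, closed form

print("train residual:", np.linalg.norm(X @ w - y))                 # ~ 0
print("gap to min-norm solution:", np.linalg.norm(w - w_min_norm))  # ~ 0
```

The zero initialization is the design choice doing the work: the iterates never leave the row space of X, and the unique interpolant in that subspace is exactly the minimum-norm one.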
The study of the implicit regularization induced by gradient-based optimization in deep learning is ...
Overparametrized deep networks predict well, despite the lack of an explicit ...
The generalization mystery in deep learning is the following: Why do over-parameterized neural netwo...
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrizatio...
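As a toy companion to this puzzle, the sketch below (our illustration on hypothetical synthetic data, not an experiment from the paper) fits minimum-norm least squares on p frozen random ReLU features. Test error typically spikes near the interpolation threshold p = n and then falls again as p grows, the "double descent" shape associated with harmless overparametrization.

```python
# Illustration (not the paper's experiment): "double descent" for
# minimum-norm regression on p frozen random ReLU features. Test error
# typically peaks near the interpolation threshold p = n and improves
# again deep in the overparameterized regime.
import numpy as np

rng = np.random.default_rng(1)
n, d, n_test = 50, 10, 2000
w_star = rng.standard_normal(d) / np.sqrt(d)      # synthetic ground truth

X = rng.standard_normal((n, d))
y = X @ w_star + 0.3 * rng.standard_normal(n)     # noisy training targets
X_test = rng.standard_normal((n_test, d))
y_test = X_test @ w_star                          # noiseless test targets

for p in (10, 30, 45, 50, 55, 100, 500, 2000):    # number of random features
    W = rng.standard_normal((d, p)) / np.sqrt(d)  # frozen random first layer
    phi = lambda Z: np.maximum(Z @ W, 0.0)        # ReLU feature map
    beta = np.linalg.pinv(phi(X)) @ y             # min-norm least squares
    mse = np.mean((phi(X_test) @ beta - y_test) ** 2)
    print(f"p = {p:4d}   test MSE = {mse:.3f}")
```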
While deep learning is successful in a num...
Conventional wisdom in deep learning states that increasing depth improves expressiveness but compli...
Neural networks trained via gradient descent with random initialization and without any regularizati...
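A minimal sketch of this setting (a generic illustration with made-up synthetic data, not the paper's construction): a wide two-layer ReLU network with random initialization, trained by plain gradient descent on its hidden layer with no regularization or early stopping, drives the training loss on noisy targets to near zero, i.e., it interpolates.

```python
# Illustration (generic, not the paper's construction): a wide two-layer
# ReLU network, random init, plain gradient descent on the hidden layer,
# no regularization or early stopping. It interpolates noisy targets.
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 30, 5, 2000                              # 30 samples, width 2000
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + rng.standard_normal(n)  # noisy targets

W = rng.standard_normal((m, d)) / np.sqrt(d)       # trained hidden weights
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)   # fixed output weights

lr = 0.3
for _ in range(20_000):
    H = X @ W.T                                    # pre-activations, (n, m)
    resid = np.maximum(H, 0.0) @ a - y             # squared-loss residuals
    # gradient of (1/2n) * sum(resid^2) w.r.t. W, backprop through ReLU
    W -= lr * ((resid[:, None] * (H > 0) * a).T @ X) / n

out = np.maximum(X @ W.T, 0.0) @ a
print("training MSE:", np.mean((out - y) ** 2))    # ~ 0: interpolation
```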
Large neural networks have proved remarkably effective in modern deep learning practice, even in the...
Current deep neural networks are highly overparameterized (up to billions of connection weights) and...
Injecting noise within gradient descent has several desirable features. In this paper, we explore no...
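As a concrete example of the general idea, the sketch below injects isotropic Gaussian noise into each gradient-descent update, a Langevin-style scheme; this is our generic illustration and not necessarily the specific noise model analyzed in the paper. On a simple nonconvex objective whose origin is a strict local maximum, the noise lets the iterate escape and reach the ring of minima, one of the desirable features noise injection is credited with.

```python
# Illustration (a generic Langevin-style scheme, not necessarily the exact
# noise model of the paper): gradient descent with isotropic Gaussian noise
# injected into every update.
import numpy as np

def noisy_gd(grad, w0, lr=1e-2, sigma=1e-2, steps=5000, seed=0):
    """Gradient descent with Gaussian noise added to each update."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w -= lr * grad(w)
        w += sigma * np.sqrt(lr) * rng.standard_normal(w.shape)
    return w

# Toy nonconvex objective f(w) = ||w||^4 / 4 - ||w||^2 / 2: the origin is a
# local maximum and the minima form the ring ||w|| = 1. Noiseless GD started
# at w = 0 is stuck there forever; the injected noise pushes it off.
grad_f = lambda w: (w @ w - 1.0) * w
w_end = noisy_gd(grad_f, w0=np.zeros(3))
print("final ||w|| (should hover near 1):", np.linalg.norm(w_end))
```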
This paper aims to investigate the limits of deep learning by exploring the issue of overfitting in ...
[previously titled "Theory of Deep Learning III: Generalization Properties of SGD"] In Theory III we...
The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodol...