The general features of the optimization problem for overparametrized nonlinear networks have been clear for a while: SGD selects global minima over local minima with high probability. In the overparametrized case, the key question is not optimization of the empirical risk but optimization with a generalization guarantee. In fact, a main puzzle of deep neural networks (DNNs) revolves around the apparent absence of “overfitting”, defined as follows: the expected error does not get worse when increasing the number of neurons or of iterations of gradient descent. This is superficially surprising given the large capacity DNNs demonstrate in fitting randomly labeled data and the absence of explicit regularization. Several recent e...
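The overparametrized regime described above can be illustrated with a minimal sketch (not taken from the paper itself): in a linear model with more parameters than samples, gradient descent initialized at zero converges to the minimum-norm interpolant, given in closed form by the pseudoinverse. Even random labels are fit exactly, mirroring the capacity observation in the abstract. All names and dimensions below are illustrative assumptions.

```python
# Minimal sketch, assuming a linear model as a stand-in for a network:
# with more parameters (d) than samples (n), the minimum-norm
# least-squares solution -- the point gradient descent from zero
# converges to -- interpolates the training data exactly, even when
# the labels are random.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 200                       # 20 samples, 200 parameters: overparametrized
X = rng.standard_normal((n, d))      # random features
y = rng.standard_normal(n)           # random labels

w = np.linalg.pinv(X) @ y            # minimum-norm interpolating solution
train_residual = np.linalg.norm(X @ w - y)
print(train_residual < 1e-8)         # training data is fit exactly
```

The point of the sketch is only that perfect interpolation of arbitrary labels is possible once parameters exceed samples; whether the interpolant generalizes is exactly the question the abstracts above address.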
The remarkable practical success of deep learning has revealed some major surprises from a theoretic...
Modern deep neural networks (DNNs) represent a formidable challenge for theorists: according to the ...
Deep learning has transformed computer vision, natural language processing, and speech recognition. ...
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrizatio...
© 2020 National Academy of Sciences. All rights reserved. While deep learning is successful in a num...
Deep networks are usually trained and tested in a regime in which the training classification error ...
[previously titled "Theory of Deep Learning III: Generalization Properties of SGD"] In Theory III we...
How to train deep neural networks (DNNs) to generalize well is a central concern in deep learning, e...
The landscape of the empirical risk of overparametrized deep convolutional neural networks (DCNNs) i...
In Theory IIb we characterize with a mix of theory and experiments the optimization of deep convolut...
The classical statistical learning theory implies that fitting too many parameters leads to overfitt...
Empirical studies show that gradient-based methods can learn deep neural networks (DNNs) with very g...
In the last decade or so, deep learning has revolutionized entire domains of machine learning. Neura...
Empirical studies show that gradient-based methods can learn deep neural networks (DNNs) with very g...
In this paper, we prove a conjecture published in 1989 and also partially address an open problem an...