The properties of flat minima in the empirical risk landscape of neural networks have been debated for some time. Increasing evidence suggests they possess better generalization capabilities than sharp ones. In this work we first discuss the relationship between two alternative measures of flatness: the local entropy, which is useful for analysis and algorithm development, and the local energy, which is easier to compute and which extensive tests on state-of-the-art networks have empirically shown to be the best predictor of generalization capabilities. We show semi-analytically, in simple controlled scenarios, that these two measures correlate strongly with each other and with generalization. Then, we extend the analysis to the deep learning...
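Both measures can be estimated by simple Monte Carlo. The sketch below is not from the paper; the toy loss, the perturbation scale, and the parameters beta and gamma are illustrative assumptions. It estimates the local energy as the average loss under Gaussian weight perturbations, and the local entropy as the log of a Gibbs-weighted volume around a point (up to a constant), on a one-dimensional loss with one sharp and one wide zero-loss minimum:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(w):
    """Toy 1-D loss (an assumption for illustration): a sharp zero-loss
    minimum at w = -1 and a wide one at w = +2."""
    return np.minimum(5.0 * (w + 1.0) ** 2, 0.2 * (w - 2.0) ** 2)

def local_energy(w, sigma=0.3, n=100_000):
    """Average loss under Gaussian weight perturbations (lower = flatter)."""
    return loss(w + sigma * rng.standard_normal(n)).mean()

def local_entropy(w, beta=10.0, gamma=10.0, n=100_000):
    """Monte Carlo estimate of log E_{w' ~ N(w, 1/gamma)}[exp(-beta * loss(w'))],
    i.e. the local entropy up to a w-independent constant (higher = more
    low-loss volume around w)."""
    a = -beta * loss(w + rng.standard_normal(n) / np.sqrt(gamma))
    m = a.max()
    return m + np.log(np.mean(np.exp(a - m)))  # numerically stable log-mean-exp

for w, label in [(-1.0, "sharp minimum"), (2.0, "wide minimum")]:
    print(f"{label}: local energy = {local_energy(w):.3f}, "
          f"local entropy = {local_entropy(w):.3f}")
```

With these (illustrative) settings the wide minimum shows both a lower local energy and a higher local entropy, the kind of correlation between the two flatness measures discussed above.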
Deep learning has transformed computer vision, natural language processing, and speech recognition. ...
The generalization mystery in deep learning is the following: Why do over-parameterized neural netwo...
While significant theoretical progress has been achieved, unveiling the generalization mystery of ov...
This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural network...
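For reference, below is a minimal sketch of an Entropy-SGD step as it is usually described: an inner SGLD loop samples from the Gibbs measure exp(-f(w') - gamma/2 (w - w')^2) while keeping a running average mu, and the outer step moves w toward mu, since gamma * (w - <w'>) approximates the negative local-entropy gradient. The toy loss and all hyperparameters are illustrative assumptions, not settings from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def loss(w):
    # Same two-minima toy as above: sharp minimum at -1, wide minimum at +2.
    return min(5.0 * (w + 1.0) ** 2, 0.2 * (w - 2.0) ** 2)

def grad(w):
    # Gradient of whichever branch of the toy loss is active at w.
    if 5.0 * (w + 1.0) ** 2 < 0.2 * (w - 2.0) ** 2:
        return 10.0 * (w + 1.0)
    return 0.4 * (w - 2.0)

def entropy_sgd_step(w, gamma=1.0, eta=0.1, eta_inner=0.05, L=20,
                     alpha=0.75, eps=1e-3):
    """One outer step: estimate <w'> by SGLD, then move w toward it."""
    w_prime, mu = w, w
    for _ in range(L):
        # SGLD on the local-entropy Gibbs measure
        # exp(-f(w') - gamma/2 * (w - w')**2).
        dw = grad(w_prime) - gamma * (w - w_prime)
        w_prime += -eta_inner * dw + np.sqrt(eta_inner) * eps * rng.standard_normal()
        mu = (1.0 - alpha) * mu + alpha * w_prime  # running average of samples
    # gamma * (w - <w'>) approximates the negative local-entropy gradient.
    return w - eta * gamma * (w - mu)

w = 0.4  # start between the two minima
for _ in range(200):
    w = entropy_sgd_step(w)
# With these settings the iterate drifts to the wide minimum at +2.
print(f"final iterate: w = {w:.2f}, loss = {loss(w):.4f}")
```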
Learning in deep neural networks takes place by minimizing a nonconvex high-dimensional loss functio...
The success of deep learning has revealed the application potential of neural networks across the sc...
The classical statistical learning theory implies that fitting too many parameters leads to overfitt...
We derive generalization and excess risk bounds for neural networks using a family of complexity mea...
In the last decade or so, deep learning has revolutionized entire domains of machine learning. Neura...
In this work, we construct generalization bounds to understand existing learning algorithms and prop...
How to train deep neural networks (DNNs) to generalize well is a central concern in deep learning, e...
The landscape of the empirical risk of overparametrized deep convolutional neural networks (DCNNs) i...
Although neural networks can solve very complex machine-learning problems, the theoretical reason fo...
When humans learn a new concept, they might ignore examples that they cannot make sense of at first,...