Deep learning has transformed computer vision, natural language processing, and speech recognition. However, two critical questions remain open: (1) why do deep neural networks generalize better than shallow networks, and (2) does a deeper network always lead to better performance? In this thesis, we derive an information-theoretic upper bound on the expected generalization error of neural networks. The bound shows that as the number of layers in the network increases, the expected generalization error decreases exponentially to zero. Layers with strict information loss, such as convolutional or pooling layers, reduce the generalization error of the whole network; this answers the first question. Howe...
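For context, information-theoretic generalization bounds of the kind referenced above typically build on the mutual-information bound of Xu and Raginsky. A schematic statement follows; the symbols here are the standard ones from that line of work, not notation taken from the thesis itself:

```latex
\[
  \bigl|\mathbb{E}[\operatorname{gen}(S,W)]\bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(S;W)},
\]
where $S=(Z_1,\dots,Z_n)$ is the training sample of size $n$, $W$ is the
hypothesis output by the learning algorithm, $I(S;W)$ is the mutual
information between them, and the loss $\ell(w,Z)$ is assumed
$\sigma$-subgaussian under the data distribution.
```

Under a bound of this shape, layers that strictly lose information shrink the relevant mutual-information term, which is one mechanism by which greater depth can tighten the bound.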
The classical statistical learning theory implies that fitting too many parameters leads to overfitt...
Modern deep neural networks (DNNs) represent a formidable challenge for theorists: according to the ...
With a direct analysis of neural networks, this paper presents a mathematically tight generalization...
This paper provides theoretical insights into why and how deep learning can generalize well, despite...
Deep networks are usually trained and tested in a regime in which the training classification error ...
The generalization mystery in deep learning is the following: Why do over-parameterized neural netwo...
Although recent works have brought some insights into the performance improvement of techniques used...
In the last decade or so, deep learning has revolutionized entire domains of machine learning. Neura...
Empirical studies show that gradient-based methods can learn deep neural networks (DNNs) with very g...
In this work, we construct generalization bounds to understand existing learning algorithms and prop...
One of the important challenges today in deep learning is explaining the outstanding power of genera...
Deep neural networks (DNNs...