We study the diversity of the features learned by a two-layer neural network trained with the least squares loss. We measure diversity by the average L2-distance between the hidden-layer features and theoretically investigate how learning non-redundant, distinct features affects the performance of the network. To do so, we derive novel generalization bounds for such networks that depend on feature diversity, based on Rademacher complexity. Our analysis proves that learning more distinct features at the hidden-layer units leads to better generalization. We also show how to extend our results to deeper networks and different losses.

This work has been supported by the NSF-Business Finland Center for Visual and Decision Informatics...
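As a concrete illustration, the sketch below computes one plausible instantiation of the diversity measure named above: the mean pairwise L2-distance between the activation vectors of the hidden units over a sample. The ReLU nonlinearity, the helper names, and the exact averaging convention are assumptions for illustration; the abstract does not fix them.

```python
import numpy as np

def hidden_features(X, W, b):
    """Hidden-layer activations of a two-layer network.

    Hypothetical helper: ReLU is an assumption; the abstract does not
    specify the nonlinearity.
    """
    return np.maximum(0.0, X @ W + b)  # shape (n_samples, n_units)

def feature_diversity(H):
    """Average pairwise L2-distance between hidden-unit feature vectors.

    Each column of H is one unit's activation pattern over the sample;
    diversity is the mean L2-distance over all distinct unit pairs.
    """
    F = H.T                                  # (n_units, n_samples)
    diffs = F[:, None, :] - F[None, :, :]    # all pairwise differences
    dists = np.linalg.norm(diffs, axis=-1)   # (n_units, n_units)
    m = F.shape[0]
    return dists.sum() / (m * (m - 1))       # exclude zero self-distances

# Usage on random data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))   # 100 samples, 10 input dimensions
W = rng.normal(size=(10, 32))    # 32 hidden units
b = np.zeros(32)
print(feature_diversity(hidden_features(X, W, b)))
```

Under this convention, a network whose units compute near-identical functions scores close to zero, while units with non-redundant activation patterns yield a larger value, matching the intuition behind the generalization result stated above.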