The practice of deep learning has shown that neural networks generalize remarkably well even with an extremely large number of learned parameters. This appears to contradict traditional statistical wisdom, according to which a trade-off between model complexity and fit to the data is essential. We set out to resolve this discrepancy from a convex optimization and sparse recovery perspective. We consider the training and generalization properties of two-layer ReLU networks with standard weight decay regularization. Under certain regularity assumptions on the data, we show that ReLU networks with an arbitrary number of parameters learn only simple models that explain the data. This is analogous to the recovery of the sparsest linear model in compressed sensing...
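For reference, the setup described above is commonly written as the following weight-decay-regularized training objective for a two-layer ReLU network (a minimal sketch; the loss $\ell$, width $m$, and regularization strength $\lambda$ are generic placeholders rather than the paper's specific choices):

$$
\min_{\{w_j,\, v_j\}_{j=1}^{m}} \; \frac{1}{n}\sum_{i=1}^{n} \ell\Big(\sum_{j=1}^{m} v_j \,(w_j^\top x_i)_+ ,\; y_i\Big) \;+\; \frac{\lambda}{2}\sum_{j=1}^{m}\big(\|w_j\|_2^2 + v_j^2\big)
$$

Weight decay applied to both layers of such a network is known to be equivalent, after rescaling the neurons, to an $\ell_1$-type penalty on the output weights, which is what makes the compressed-sensing analogy in the abstract natural.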
Classifiers used in the wild, in particular for safety-critical systems, should not only have good g...
Large neural networks have proved remarkably effective in modern deep learning practice, even in the...
We develop fast algorithms and robust software for convex optimization of two-layer neural networks ...
Given a training set, a loss function, and a neural network architecture, it i...
Today, various forms of neural networks are trained to perform approximation tasks in many fields. H...
Understanding the fundamental principles behind the success of deep neural networks is one of the mo...
Understanding the computational complexity of training simple neural networks with rectified linear ...
In this paper, we consider one dimensional (shallow) ReLU neural networks in which weights are chose...
Convex $\ell_1$ regularization using an infinite dictionary of neurons has been suggested for constr...
Current deep neural networks are highly overparameterized (up to billions of connection weights) and...
Rectified linear units (ReLUs) have become the main model for the neural units in current deep learn...
In this note, we study how neural networks with a single hidden layer and ReLU activation interpolat...
In deep learning it is common to overparameterize neural networks, that is, to use more parameters t...
Neural networks (NNs) have seen a surge in popularity due to their unprecedented practical success i...
Recent theoretical works on over-parameterized neural nets have focused on two aspects: optimization...