This paper analyzes the convergence and generalization of training a one-hidden-layer neural network when the input features follow a Gaussian mixture model consisting of a finite number of Gaussian distributions. Assuming the labels are generated from a teacher model with unknown ground-truth weights, the learning problem is to estimate the underlying teacher model by minimizing a non-convex risk function over a student neural network. With a finite number of training samples, referred to as the sample complexity, the iterates are proven to converge linearly to a critical point with a guaranteed generalization error. In addition, for the first time, this paper characterizes the impact of the input distributions on the sample complexity an...
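The setup described in this abstract admits a compact simulation. Below is a minimal sketch of the teacher-student problem: inputs drawn from a two-component Gaussian mixture, labels produced by a fixed one-hidden-layer teacher, and a student of the same architecture trained by gradient descent on the empirical (non-convex) squared risk. The component means, the tanh activation, the network width, and all step sizes here are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: input dimension d, hidden width K, sample count n.
d, K, n = 20, 3, 500

# Gaussian mixture inputs with two symmetric components (assumed for illustration).
mu = rng.normal(size=d)
z = rng.integers(0, 2, size=n)                      # component assignments
X = np.where(z[:, None] == 0, mu, -mu) + rng.normal(size=(n, d))

# Teacher: one-hidden-layer network with unknown ground-truth weights W_star.
phi = np.tanh                                       # smooth activation (assumed)
W_star = rng.normal(size=(K, d))
y = phi(X @ W_star.T).sum(axis=1)                   # labels generated by the teacher

# Student: same architecture, trained by gradient descent on the empirical risk.
W = 0.1 * rng.normal(size=(K, d))
lr = 0.05
for _ in range(2000):
    pre = X @ W.T                                   # (n, K) pre-activations
    resid = phi(pre).sum(axis=1) - y                # prediction residuals
    # gradient of 0.5 * mean(resid^2) w.r.t. W, using tanh' = 1 - tanh^2
    grad = ((resid[:, None] * (1.0 - phi(pre) ** 2)).T @ X) / n
    W -= lr * grad                                  # gradient step

print("final empirical risk:",
      0.5 * np.mean((phi(X @ W.T).sum(axis=1) - y) ** 2))
```

Under these assumed settings the empirical risk decreases steadily; the paper's guarantees concern this procedure's convergence rate and generalization error as a function of the sample complexity.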
In this study, we focus on feed-forward neural networks with a single hidden layer. The research tou...
Deep neural networks achieve stellar generalisation even when they have enough...
We propose a novel learning algorithm to train networks with multi-layer linear-threshold ...
In this thesis, we theoretically analyze the ability of neural networks trained by gradient descent ...
Understanding the impact of data structure on the computational tractability of learning is a key ch...
Many supervised machine learning methods are naturally cast as optimization pr...
Neural networks (NNs) have seen a surge in popularity due to their unprecedented practical success i...
Deep neural networks achieve stellar generalisation on a variety of problems, despite often being la...
The accompanying code for this paper is available at https://github.com/sgoldt/gaussian-equiv-2layer...
We study the effect of regularization in an on-line gradient-descent learning scenario for a general...
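Where this abstract discusses regularization in on-line gradient descent, the standard concrete instance is an L2 penalty (weight decay) added to each stochastic update. The sketch below uses a single-unit student and teacher with tanh activation; all constants are illustrative assumptions rather than the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 50
w_star = rng.normal(size=d) / np.sqrt(d)   # teacher weights (fixed)
w = rng.normal(size=d) / np.sqrt(d)        # student weights, trained on-line
eta, lam = 0.5, 1e-3                       # learning rate and decay (assumed)

for t in range(20000):
    x = rng.normal(size=d)                           # fresh example each step
    resid = np.tanh(w @ x) - np.tanh(w_star @ x)     # on-line prediction error
    grad = resid * (1.0 - np.tanh(w @ x) ** 2) * x   # squared-loss gradient
    w -= (eta / d) * (grad + lam * w)                # gradient step + weight decay
```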
By making assumptions on the probability distribution of the potentials in a feed-forward neural net...
In this article we study fully-connected feedforward deep ReLU ANNs with an arbitrarily large number...
The lack of crisp mathematical models that capture the structure of real-world data sets is a major ...
We present an analytic solution to the problem of on-line gradient-descent learning for two-layer ne...
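For context, the classical on-line learning setting behind this line of work is a soft committee machine: a two-layer network with erf activation and unit second-layer weights, trained on one fresh Gaussian input per step. A minimal simulation of those dynamics might look like the following; the sizes and learning rate are illustrative assumptions, and the analytic solution itself tracks order parameters rather than the weights simulated here.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(2)
d, M, K = 100, 2, 2                        # input dim, teacher/student widths (assumed)
g = lambda u: erf(u / np.sqrt(2))          # classical soft-committee activation
dg = lambda u: np.sqrt(2.0 / np.pi) * np.exp(-u ** 2 / 2)   # its derivative

B = rng.normal(size=(M, d)) / np.sqrt(d)   # teacher weights (fixed)
W = rng.normal(size=(K, d)) / np.sqrt(d)   # student weights
eta = 0.5                                  # learning rate (assumed)

for t in range(50000):
    x = rng.normal(size=d)                 # one fresh Gaussian input per step
    pre = W @ x
    resid = g(pre).sum() - g(B @ x).sum()  # student minus teacher output
    W -= (eta / d) * resid * dg(pre)[:, None] * x[None, :]   # on-line update

# Monte Carlo estimate of the generalization error on fresh inputs.
Xt = rng.normal(size=(5000, d))
eg = 0.5 * np.mean((g(Xt @ W.T).sum(axis=1) - g(Xt @ B.T).sum(axis=1)) ** 2)
print("estimated generalization error:", eg)
```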