In the context of supervised learning of a function by a neural network, we claim and empirically verify that the neural network yields better results when the distribution of the data set focuses on regions where the function to learn is steep. We first translate this assumption into a mathematically workable form using Taylor expansion, and derive a new training distribution based on the derivatives of the function to learn. Theoretical derivations then allow us to construct a methodology that we call variance based samples weighting (VBSW). VBSW uses the local variance of the labels to weight the training points. This methodology is general, scalable, cost-effective, and significantly improves the performance of a large class...
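The abstract describes VBSW only at a high level. Below is a minimal sketch of the core idea, assuming the local variance is estimated over the k nearest neighbours of each training point; the function name `vbsw_weights`, the choice of k, and the normalisation are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def vbsw_weights(X, y, k=10):
    """Weight each training point by the local variance of its labels.

    Points lying in regions where the target function is steep have
    neighbours with widely varying labels, so they receive large weights.
    """
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    _, idx = nn.kneighbors(X)           # k nearest neighbours of each point
    local_var = y[idx].var(axis=1)      # label variance in each neighbourhood
    return local_var / (local_var.mean() + 1e-12)  # normalise to mean 1

# The weights can then be passed to any loss that accepts per-sample
# weights, e.g. model.fit(X, y, sample_weight=vbsw_weights(X, y)).
```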
The bias/variance dilemma is addressed in the context of neural networks. A bias constraint based on...
We study the theory of neural networks (NNs) through the lens of classical nonparametric regression problems...
Modern machine learning models are complex, hierarchical, and large-scale and are trained using non-...
We present weight normalization: a reparameterization of the weight vectors in a neural net...
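The reparameterization itself is compact enough to state directly; what follows is a minimal NumPy sketch of the idea for a single weight vector, not the authors' implementation.

```python
import numpy as np

def weight_norm(v, g):
    """Express a weight vector as w = g * v / ||v||.

    Decoupling the direction v from the scale g lets gradient descent
    adjust the two independently, which the paper argues improves the
    conditioning of the optimization problem.
    """
    return g * v / np.linalg.norm(v)
```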
A novel technique for deep learning of image classifiers is presented. The learned CNN models higher...
Training deep neural networks requires many training samples, but in practice, training labels are e...
The performance of deep learning (DL) models is highly dependent on the quality and size of the trai...
Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform techniqu...
Standard neural networks struggle to generalize...
For many types of learners one can compute the statistically "optimal" way to select data...
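One classical instantiation of such "optimal" data selection is to query the candidate point where the learner is most uncertain. A hedged sketch, assuming uncertainty is proxied by the predictive variance of an ensemble; the ensemble interface and `max_variance_query` are hypothetical, not from the cited work.

```python
import numpy as np

def max_variance_query(ensemble, X_pool):
    """Return the index of the pool point where the models disagree most.

    The variance of the ensemble's predictions serves as an estimate of
    the learner's uncertainty at each candidate point.
    """
    preds = np.stack([m.predict(X_pool) for m in ensemble])  # (n_models, n_pool)
    return int(preds.var(axis=0).argmax())
```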
Deep neural network training spends most of the computation on examples that are properly handled, a...
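The truncated abstract points toward concentrating computation on informative examples. One common way to do this, shown here as an illustrative sketch rather than the paper's algorithm, is to sample minibatch elements in proportion to their current loss and reweight gradients with inverse-probability factors to reduce bias.

```python
import numpy as np

def importance_sample(losses, batch_size, rng=np.random.default_rng(0)):
    """Draw a minibatch with probability proportional to per-example loss.

    Low-loss examples the network already handles well are sampled
    rarely, so computation concentrates on the hard examples.
    """
    p = losses / losses.sum()
    idx = rng.choice(len(losses), size=batch_size, replace=False, p=p)
    weights = 1.0 / (len(losses) * p[idx])  # inverse-probability correction
    return idx, weights
```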
A new method of initializing the weights in deep neural networks is proposed. The method follows two...
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights ...