Recently, several studies have established the global convergence and generalization ability of gradient descent for two-layer ReLU networks. With a few exceptions, these studies focus on regression problems with the squared loss, and they highlight the importance of the positivity of the neural tangent kernel. By contrast, the performance of gradient descent on classification problems with the logistic loss has received much less attention, and the structure of these problems leaves room for further investigation. In this work, we demonstrate that a separability assumption stated in terms of the neural tangent model is more reasonable than the positivity condition on the neural tangent kernel, and we provide a refined convergence analysis of gradient descent under this assumption.
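To fix ideas, here is a minimal sketch of the two conditions being contrasted, written in standard neural-tangent-kernel notation; the symbols below (f_\theta, H^\infty, \lambda_0, v, \gamma) are illustrative choices for this sketch, not necessarily the paper's own.

% Two-layer network of width m in the standard NTK parameterization (illustrative notation):
\[
  f_\theta(x) = \frac{1}{\sqrt{m}} \sum_{r=1}^{m} a_r\, \sigma(w_r^\top x).
\]
% Positivity of the neural tangent kernel, as typically assumed in squared-loss analyses:
\[
  \lambda_{\min}(H^\infty) \ge \lambda_0 > 0, \qquad
  H^\infty_{ij} = \mathbb{E}_{w \sim \mathcal{N}(0, I)}\big[\sigma'(w^\top x_i)\,\sigma'(w^\top x_j)\big]\, x_i^\top x_j.
\]
% Separability by the neural tangent model with margin \gamma, for logistic-loss classification:
\[
  \exists\, v \ \text{with}\ \|v\| \le 1 \ \text{such that}\quad
  y_i \big\langle v,\ \nabla_\theta f_{\theta_0}(x_i) \big\rangle \ge \gamma > 0
  \quad \text{for all } i.
\]

Roughly, the second condition only asks that the training labels be linearly separable with positive margin in the tangent feature space at initialization, which can hold even when the kernel matrix H^\infty is degenerate; this is the sense in which separability is the weaker assumption.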