Stochastic Gradient Descent (SGD) is the workhorse beneath the deep learning revolution. However, SGD is known to suffer from reduced convergence speed due to the plateau phenomenon. Stochastic Natural Gradient Descent (SNGD) was proposed by Amari to resolve that problem by taking advantage of the geometry of the space. Nevertheless, the convergence of SNGD is not guaranteed. The aim of this article is to modify SNGD to obtain a convergent variant, which we name Convergent SNGD (CSNGD), and to test it on a specific toy optimization problem. In particular, we concentrate on the problem of learning a discrete probability distribution. Based on the variable-metric convergence results presented by Sunehag et al. [13], we prove the convergence of CSNGD. Furt...
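As a rough illustration of the setting this abstract describes (and not the paper's CSNGD algorithm itself), the sketch below fits a categorical distribution to a stream of samples with a stochastic natural-gradient update; the target distribution, the decaying step-size schedule, and the iteration count are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([0.7, 0.2, 0.1])   # hypothetical distribution we draw samples from
p = np.ones(3) / 3                    # model categorical distribution, initialized uniform

for t in range(20000):
    x = rng.choice(3, p=target)       # one stochastic sample per step
    eta = 1.0 / (t + 1) ** 0.7        # assumed Robbins-Monro-style decaying step size
    # Euclidean gradient of the per-sample NLL -log p[x] is g_i = -1[x == i] / p_i;
    # preconditioning with the inverse Fisher metric diag(p) turns the update into
    # a simple additive step on the sampled category.
    p[x] += eta
    p /= p.sum()                      # re-project onto the probability simplex

print(np.round(p, 3))                 # approaches `target` as t grows
```

The natural-gradient preconditioning is what removes the 1/p_i factor from the raw gradient, so the step size no longer blows up for low-probability categories; that is the kind of geometry-aware behavior the abstract attributes to SNGD.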
The goal of this paper is to debunk and dispel the magic behind black-box optimizers and stochastic ...
We introduce a novel and efficient algorithm called the stochastic approximate gradient descent (SAG...
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" lear...
The vast majority of convergence rate analyses for stochastic gradient methods in the literature fo...
An increasing number of machine learning problems, such as robust or adversari...
We design step-size schemes that make stochastic gradient descent (SGD) adaptive to (i) the noise σ ...
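To make the flavor of such noise-adaptive schedules concrete (the exact schemes of this abstract are not reproduced here), below is a common choice of this kind, an AdaGrad-Norm-style step size, applied to a toy one-dimensional stochastic quadratic; the objective, the constants eta0 and b0, and the iteration count are assumptions for the illustration.

```python
import numpy as np

def sgd_adagrad_norm(grad_fn, x0, steps=2000, eta0=1.0, b0=1e-8):
    """SGD with step size eta0 / sqrt(b0 + cumulative squared gradient norm).

    The schedule needs no prior knowledge of the gradient-noise level:
    noisier gradients inflate the accumulator and shrink the step automatically.
    """
    x = np.asarray(x0, dtype=float)
    accum = b0
    for _ in range(steps):
        g = grad_fn(x)                       # stochastic gradient estimate at x
        accum += float(np.dot(g, g))         # running sum of squared gradient norms
        x = x - eta0 / np.sqrt(accum) * g    # step shrinks as noise accumulates
    return x

# Toy usage: minimize E[(x - z)^2 / 2] where the targets z ~ N(3, 1) are noisy.
rng = np.random.default_rng(0)
sol = sgd_adagrad_norm(lambda x: x - (3.0 + rng.standard_normal()), x0=np.array([0.0]))
print(sol)  # should end up near 3
```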
The convergence of Stochastic Gradient Descent (SGD) us...
In machine learning, stochastic gradient descent (SGD) is widely deployed to train models using high...
Stochastic Gradient Descent (SGD) algorithms remain popular optimizers for deep learning networks a...
In the age of artificial intelligence, the best approach to handling huge amounts of data is a treme...
Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization ...
Stochastic gradient descent (SGD) stands as a classical method to build large-scale machine learning ...
Large-scale learning problems require algorithms that scale benignly with respect to the size of the...