We study stochastic gradient descent (SGD) and the stochastic heavy ball method (SHB, otherwise known as the momentum method) for the general stochastic approximation problem. For SGD, in the convex and smooth setting, we provide the first almost sure asymptotic convergence rates for a weighted average of the iterates. More precisely, we show that the convergence rate of the function values is arbitrarily close to o(1/\sqrt{k}), and is exactly o(1/k) in the so-called overparametrized case. We show that these results still hold when using stochastic line search and stochastic Polyak stepsizes, thereby giving the first proof of convergence of these methods in the non-overparametrized regime. Using a substantially different analysis, we show that ...
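To make the stepsize rules above concrete, here is a minimal Python sketch of SGD with a stochastic Polyak stepsize and a weighted average of the iterates, run on a toy interpolating least-squares problem. The SPS_max form gamma_k = min{(f_i(x_k) - f_i^*) / (c ||grad f_i(x_k)||^2), gamma_max}, the constants c and gamma_max, and the toy data are illustrative assumptions, not the paper's exact setup.

import numpy as np

rng = np.random.default_rng(0)

# Toy interpolating least-squares problem: f(x) = (1/n) * sum_i (a_i @ x - b_i)^2,
# where every component f_i is minimized at x_star (the "overparametrized" regime).
n, d = 200, 10
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star

def loss_i(x, i):
    r = A[i] @ x - b[i]
    return r * r

def grad_i(x, i):
    return 2.0 * (A[i] @ x - b[i]) * A[i]

x = np.zeros(d)
weighted_sum, weight_total = np.zeros(d), 0.0
c, gamma_max = 0.5, 1.0  # assumed SPS_max hyperparameters

for k in range(1, 5001):
    i = rng.integers(n)
    g = grad_i(x, i)
    # Stochastic Polyak stepsize with f_i^* = 0 (interpolation), capped at gamma_max.
    gamma = min(loss_i(x, i) / (c * (g @ g) + 1e-12), gamma_max)
    x = x - gamma * g
    # Weighted average of the iterates, with weights proportional to k.
    weighted_sum += k * x
    weight_total += k

x_bar = weighted_sum / weight_total
print("distance of averaged iterate to minimizer:", np.linalg.norm(x_bar - x_star))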
A common problem in statistics is estimating the minimizer of a convex function. When we ha...
In this paper, a general stochastic optimization procedure is studied, unifyin...
The vast majority of convergence rate analyses for stochastic gradient methods in the literature fo...
With a weighting scheme proportional to t, a traditional stochastic gradient descent (SGD) algorithm...
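To make the weighting explicit: with weights proportional to the iteration index t, the averaged iterate is

\bar{x}_t = \frac{\sum_{k=1}^{t} k\, x_k}{\sum_{k=1}^{t} k} = \frac{2}{t(t+1)} \sum_{k=1}^{t} k\, x_k,

which can be maintained online via the running update \bar{x}_t = \bigl(1 - \tfrac{2}{t+1}\bigr)\bar{x}_{t-1} + \tfrac{2}{t+1}\, x_t. This is the standard form of such schemes; the abstract is truncated, so the paper's exact weights may differ.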
Recent studies have provided both empirical and theoretical evidence illustrat...
This paper deals with a natural stochastic optimization procedure derived from the so-called Heavy-b...
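For reference, one common discrete form of the stochastic heavy-ball recursion is

x_{k+1} = x_k + \beta\,(x_k - x_{k-1}) - \gamma_{k+1}\, \nabla f(x_k; \xi_{k+1}),

with momentum parameter \beta \in [0, 1), stepsizes \gamma_k, and \xi_{k+1} the random sample drawn at step k+1. The abstract is truncated here, so this is the textbook recursion rather than the paper's exact (possibly time-varying) parametrization.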
Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization pr...
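The baseline update in question is the plain SGD recursion

x_{k+1} = x_k - \gamma_k\, g_k, \qquad \mathbb{E}[g_k \mid x_k] = \nabla f(x_k),

where g_k is an unbiased stochastic gradient and \gamma_k a stepsize sequence, e.g. \gamma_k \propto 1/\sqrt{k} in the convex setting; the specific schedule is an assumption here, since the abstract is truncated.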