We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for quadratic non-strongly-convex problems may be reformulated as constant-parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate O(1/n^2), where n is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, where we extend our general convergence result, which suggests an alternative algorithm (i.e., with different step sizes) that...
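The reformulation described above can be illustrated with a minimal sketch (not the paper's exact algorithm or step sizes): on a quadratic f(x) = ½ xᵀAx − bᵀx, the heavy-ball update x_{n+1} = x_n − s∇f(x_n) + m(x_n − x_{n−1}) is a constant-coefficient second-order difference equation in x_n, and convergence corresponds to stability of that linear recursion. The matrix A, step size s and momentum m below are illustrative choices.

```python
import numpy as np

# Quadratic objective f(x) = 0.5 x^T A x - b^T x (example values)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # positive-definite Hessian
b = np.array([1.0, 0.0])
x_star = np.linalg.solve(A, b)      # exact minimizer, for comparison

s, m = 0.1, 0.5                     # step size and momentum (illustrative)
x_prev = np.zeros(2)
x = np.zeros(2)
for _ in range(500):
    grad = A @ x - b
    # second-order recursion: next iterate depends on the two previous ones
    x, x_prev = x - s * grad + m * (x - x_prev), x

print(np.linalg.norm(x - x_star))   # small: the recursion is stable
```

Stability of the recursion (and hence convergence) hinges on the roots of the characteristic polynomial z² − (1 + m − sλ)z + m lying inside the unit disk for every eigenvalue λ of A, which is the kind of eigenvalue analysis the abstract refers to.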
With a weighting scheme proportional to t, a traditional stochastic gradient descent (SGD) algorithm...
Replica exchange stochastic gradient Langevin dynamics (reSGLD) has shown promise in accelerating th...
The stochastic gradient Langevin Dynamics is one of the most fundamental algorithms to solve samplin...
Stochastic gradient descent is popular for large scale optimization but has slow convergence asympto...
Gradient-based optimization algorithms, in particular their stochastic counterparts, have become by ...
Recent studies have provided both empirical and theoretical evidence illustrat...
We develop the mathematical foundations of the stochastic modified equations (SME) framework for ana...
We study stochastic gradient descent (SGD) and the stochastic heavy ball method (SHB, otherwise know...
We consider the fundamental problem in nonconvex optimization of efficiently reaching a stationary p...
Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization pr...
This paper provides a framework to analyze stochastic gradient algorithms in a mean squared error (M...
Consider the problem of minimizing functions that are Lipschitz and strongly convex, but not necessa...