94 pages, 4 figures. This paper proposes a thorough theoretical analysis of Stochastic Gradient Descent (SGD) with non-increasing step sizes. First, we show that the recursion defining SGD can be provably approximated by solutions of a time-inhomogeneous Stochastic Differential Equation (SDE) using an appropriate coupling. In the specific case of batch noise, we refine our results using recent advances in Stein's method. Then, motivated by recent analyses of deterministic and stochastic optimization methods via their continuous counterparts, we study the long-time behavior of the continuous processes at hand and establish non-asymptotic bounds. To that purpose, we develop new comparison techniques which are of independent interest. Adapting th...
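As a purely illustrative, hedged sketch (not the coupling construction of the paper), the toy snippet below runs the SGD recursion with a non-increasing step-size schedule alongside an Euler-Maruyama step of a time-inhomogeneous SDE with matching drift; the objective, the schedule, and the noise scale are all assumptions made for the example.

```python
# Illustrative sketch only: SGD with non-increasing step sizes gamma_k on a toy
# quadratic, together with an Euler-Maruyama step of the time-inhomogeneous SDE
#     dX_t = -grad f(X_t) dt + sigma * sqrt(gamma(t)) dW_t,
# taking the time increment equal to gamma_k so the two updates share the same
# mean and variance. All names and constants here are assumed, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
grad_f = lambda x: x                      # gradient of the toy objective f(x) = 0.5 * ||x||^2
gamma = lambda k: 0.5 / (1.0 + k) ** 0.5  # non-increasing step-size schedule (assumed)
sigma = 0.1                               # assumed gradient-noise scale

x_sgd = np.ones(2)   # discrete SGD iterate
x_sde = np.ones(2)   # Euler-Maruyama iterate of the SDE
for k in range(1000):
    g = gamma(k)
    # SGD recursion: x_{k+1} = x_k - gamma_k * (grad f(x_k) + noise_k)
    x_sgd = x_sgd - g * (grad_f(x_sgd) + sigma * rng.standard_normal(2))
    # Euler-Maruyama with time step gamma_k: drift -grad f, diffusion sigma * sqrt(gamma_k)
    x_sde = x_sde - g * grad_f(x_sde) + sigma * np.sqrt(g) * np.sqrt(g) * rng.standard_normal(2)

print(x_sgd, x_sde)
```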
The vast majority of convergence rates analysis for stochastic gradient methods in the literature fo...
Stochastic gradient descent (SGD) has been widely used in machine learning due...
We analyze the global and local behavior of gradient-like flows under stochastic errors towards the ...
We propose new continuous-time formulations for first-order stochastic optimization algorithms such ...
In this article, a family of SDEs are derived as a tool to understand the behavior of numerical opti...
Stochastic gradient descent is an optimisation method that combines classical gradient des...
We develop the mathematical foundations of the stochastic modified equations (SME) framework for ana...
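For context, the first-order stochastic modified equation usually quoted for SGD with learning rate \(\eta\) takes the form below; the notation (\(f\) for the objective, \(\Sigma\) for the gradient-noise covariance, \(i_k\) for the sampled index) is assumed here rather than taken from the truncated abstract.

```latex
% First-order stochastic modified equation commonly associated with SGD
% (learning rate $\eta$, gradient-noise covariance $\Sigma$); notation assumed.
\[
  \mathrm{d}X_t \;=\; -\nabla f(X_t)\,\mathrm{d}t \;+\; \sqrt{\eta}\,\Sigma(X_t)^{1/2}\,\mathrm{d}W_t ,
\]
% understood as a weak approximation of the SGD iterates
\[
  x_{k+1} \;=\; x_k \;-\; \eta\,\nabla f_{i_k}(x_k), \qquad X_{k\eta} \approx x_k .
\]
```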
The gradient noise of Stochastic Gradient Descent (SGD) is considered to play a key role in its prop...
We design step-size schemes that make stochastic gradient descent (SGD) adaptive to (i) the noise σ ...
In this thesis we want to give a theoretical and practical introduction to stochastic gradient desce...
We develop a new continuous-time stochastic gradient descent method for optimizing over the stationa...
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" lear...
Recently, Loizou et al. (2021) proposed and analyzed stochastic gradient descent (SGD) with stochas...
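As a hedged sketch of SGD with the stochastic Polyak step size in the spirit of Loizou et al. (2021): the constants c and gamma_b, the toy least-squares data, and the interpolation assumption that each per-sample optimum f_i^* equals 0 are illustrative choices, not details taken from the paper.

```python
# Hedged sketch: SGD with a stochastic Polyak step size (SPS_max-style cap) on a
# consistent least-squares problem, where each per-sample minimum f_i^* is 0.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star                      # consistent system, so f_i^* = 0 for every i

def f_i(x, i):
    # Per-sample loss f_i(x) = 0.5 * (a_i^T x - b_i)^2.
    return 0.5 * (A[i] @ x - b[i]) ** 2

def grad_f_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

c, gamma_b = 0.5, 1.0               # SPS constants (assumed values)
x = np.zeros(d)
for k in range(2000):
    i = rng.integers(n)
    g = grad_f_i(x, i)
    sq = g @ g
    if sq == 0.0:
        continue
    # Capped stochastic Polyak step: min((f_i(x) - f_i^*) / (c * ||grad f_i(x)||^2), gamma_b)
    gamma_k = min((f_i(x, i) - 0.0) / (c * sq), gamma_b)
    x = x - gamma_k * g

print(np.linalg.norm(x - x_star))   # distance to the least-squares solution
```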