We propose new continuous-time formulations for first-order stochastic optimization algorithms such as mini-batch gradient descent and variance-reduced methods. We combine these continuous-time models with simple Lyapunov analysis and tools from stochastic calculus to derive convergence bounds for several classes of non-convex functions. Guided by this analysis, we show that the same Lyapunov arguments hold in discrete time, leading to matching rates. In addition, we use these models and Itô calculus to derive novel insights into the dynamics of SGD, proving that a decreasing learning rate acts as a time warping or, equivalently, as a landscape stretching.
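For illustration, here is a minimal sketch of the time-warping argument under the standard SDE approximation of SGD with a time-varying step size; the specific model, scaling, and notation below are assumptions for exposition, not necessarily the exact formulation used in this work.

```latex
% Minimal sketch (assumed model, illustrative notation): SGD with step-size
% schedule \eta\,\psi(t) is approximated by the SDE
%   dX_t = -\psi(t)\,\nabla f(X_t)\,dt + \psi(t)\,\sqrt{\eta}\,\sigma\,dB_t .
% Under the time change \tau(t) = \int_0^t \psi(s)\,ds, with Y_\tau := X_{t(\tau)},
% the drift becomes plain gradient flow in the new clock (time warping), while
% the diffusion coefficient is rescaled by \psi:
\begin{equation*}
  dY_\tau \;=\; -\nabla f(Y_\tau)\,d\tau
            \;+\; \sqrt{\eta\,\psi\bigl(t(\tau)\bigr)}\,\sigma\,dW_\tau .
\end{equation*}
```

In the warped clock the process looks like constant-rate SGD with an effective learning rate $\eta\,\psi(t(\tau))$, which is one way to read the stated equivalence between a decreasing learning rate and a stretched landscape.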
Stochastic Approximation (SA) is a classical algorithm that has had, since the early days, a huge impa...
We develop a new continuous-time stochastic gradient descent method for optimizing over the stationa...
We study stochastic algorithms in a streaming framework, trained on samples coming from a dependent ...
First-order methods are often analyzed via their continuous-time models, where their worst-case conv...
This paper proposes a thorough theoretical analysis of Stochastic Gradient Descen...
Stochastic gradient descent is an optimisation method that combines classical gradient des...
Optimization problems with continuous data appear in, e.g., robust machine learning, functional data...
We consider the problem of policy evaluation for continuous-time processes using the temporal-differ...
In this article, a family of SDEs is derived as a tool to understand the behavior of numerical opti...
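To show how such SDE models relate to the discrete iterates, here is a minimal Python sketch that compares SGD with an Euler-Maruyama discretisation of the commonly used approximation dX = -∇f(X) dt + √η σ dB on a toy quadratic; the objective, noise model, and step sizes are illustrative assumptions, not the particular family of SDEs derived in that article.

```python
import numpy as np

rng = np.random.default_rng(0)
eta, sigma, steps = 0.1, 0.5, 100   # learning rate, gradient-noise std, SGD steps
grad = lambda x: x                  # f(x) = x^2 / 2, so grad f(x) = x

# SGD with additive Gaussian gradient noise
x, sgd_path = 2.0, [2.0]
for _ in range(steps):
    x -= eta * (grad(x) + sigma * rng.standard_normal())
    sgd_path.append(x)

# Euler-Maruyama discretisation of dX = -grad f(X) dt + sqrt(eta)*sigma dB
# over the same horizon t = steps * eta, with a finer internal step.
sub = 10                            # SDE substeps per SGD step
dt = eta / sub
y, sde_path = 2.0, [2.0]
for _ in range(steps):
    for _ in range(sub):
        y += -grad(y) * dt + np.sqrt(eta) * sigma * np.sqrt(dt) * rng.standard_normal()
    sde_path.append(y)

print(f"SGD final iterate: {sgd_path[-1]:.3f}, SDE model at t={steps*eta:.1f}: {sde_path[-1]:.3f}")
```

With the step dt set equal to the learning rate, one Euler-Maruyama step reduces exactly to one SGD step, which is the usual motivation for this class of models.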
We consider stochastic optimization problems where data is drawn from a Markov chain. Existing metho...
Approaches like finite differences with common random numbers, infinitesimal perturbation analysis, ...
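As a concrete instance of the first approach named above, here is a minimal Python sketch of finite differences with common random numbers (CRN) for estimating the gradient of an expectation; the toy simulator and parameter names are illustrative assumptions, not taken from that work.

```python
import numpy as np

def F(x, xi):
    """Illustrative noisy objective with E[F(x, xi)] = (x - 1)**2."""
    return (x - 1.0) ** 2 + 0.3 * xi * x

def crn_fd_gradient(x, h=1e-3, n_samples=1000, seed=0):
    # Evaluate F at x+h and x-h with the SAME random draws (common random
    # numbers); the shared noise largely cancels in the difference, so the
    # estimator stays stable even as h shrinks.
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(n_samples)
    return np.mean((F(x + h, xi) - F(x - h, xi)) / (2 * h))

print(crn_fd_gradient(2.0))   # true gradient of E[F] at x = 2 is 2*(2-1) = 2
```

With independent draws at the two evaluation points, the noise term would be divided by 2h and blow up as h goes to zero; reusing the draws is what makes the finite-difference estimator usable here.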
We prove the convergence to minima and estimates on the rate of convergence for the stochastic gradi...
Optimization is among the richest modeling languages in science. In statistics and machine learning,...
While the design of algorithms is traditionally a discrete endeavour, in recent years many advances ...