Stochastic gradient descent (SGD) and its variants are the main workhorses for solving large-scale optimization problems with nonconvex objective functions. Although the convergence of SGD in the (strongly) convex case is well understood, its convergence for nonconvex functions rests on much weaker mathematical foundations. Most existing studies of the nonconvex convergence of SGD establish complexity results in terms of either the minimum expected gradient norm or the functional sub-optimality gap (for functions with extra structural properties), taken over the entire range of iterates. Hence the last iterates of SGD do not necessarily carry the same complexity guarantee. This paper shows that an $\epsilon$-stationary point exists in ...
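To make the min-versus-last-iterate distinction concrete, here is a minimal sketch in Python on a hypothetical one-dimensional nonconvex objective (the function, noise level, and step-size schedule are all illustrative assumptions, not the paper's setting) that tracks both the minimum gradient norm over all iterates and the gradient norm at the final iterate:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(x):
    # Gradient of the illustrative nonconvex function f(x) = x^2 + 3*sin^2(x).
    return 2 * x + 6 * np.sin(x) * np.cos(x)

def stochastic_grad(x, sigma=1.0):
    # Unbiased estimate: true gradient plus zero-mean Gaussian noise.
    return grad(x) + sigma * rng.normal()

x = 3.0
T = 10_000
min_grad_norm = np.inf
for t in range(1, T + 1):
    g = stochastic_grad(x)
    x -= g / np.sqrt(t)          # decaying step size gamma_t = 1/sqrt(t)
    min_grad_norm = min(min_grad_norm, abs(grad(x)))

print(f"min_t |grad f(x_t)| = {min_grad_norm:.4f}")  # what classical bounds certify
print(f"|grad f(x_T)|       = {abs(grad(x)):.4f}")   # last iterate: not covered by them
```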
We study the complexity of finding the global solution to stochastic nonconvex optimization when the...
Recently, Loizou et al. (2021) proposed and analyzed stochastic gradient descent (SGD) with stochas...
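Assuming this refers to the stochastic Polyak step-size (SPS) of Loizou et al. (2021), whose capped form sets $\gamma_t = \min\{(f_i(x_t) - f_i^*)/(c\,\|\nabla f_i(x_t)\|^2),\ \gamma_b\}$, a minimal sketch on an illustrative least-squares problem (the problem instance and constants are placeholders) is:

```python
import numpy as np

# Hedged sketch of SGD with the capped stochastic Polyak step-size (SPS_max):
#   gamma_t = min( (f_i(x) - f_i^*) / (c * ||g_i||^2), gamma_b ).
# The interpolating least-squares problem below is illustrative, not the paper's setup.

rng = np.random.default_rng(1)
A = rng.normal(size=(100, 5))
x_true = rng.normal(size=5)
b = A @ x_true                      # interpolation holds, so each f_i^* = 0

def sps_sgd(x, c=0.5, gamma_b=10.0, T=500):
    for _ in range(T):
        i = rng.integers(len(b))
        residual = A[i] @ x - b[i]
        f_i = 0.5 * residual ** 2   # per-sample loss; its minimum f_i^* is 0 here
        g_i = residual * A[i]       # per-sample gradient
        gamma = min(f_i / (c * (g_i @ g_i) + 1e-12), gamma_b)
        x = x - gamma * g_i
    return x

x_hat = sps_sgd(np.zeros(5))
print("distance to solution:", np.linalg.norm(x_hat - x_true))
```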
The subject of this thesis is the analysis of several stochastic algorithms in a nonconvex setting. ...
In machine learning, stochastic gradient descent (SGD) is widely deployed to train models using high...
This paper proposes a thorough theoretical analysis of Stochastic Gradient Descen...
In this paper, we examine a class of nonconvex stochastic optimization proble...
We aim to make stochastic gradient descent (SGD) adaptive to (i) the noise $\sigma^2$ in the stochas...
SGD with Momentum (SGDM) is a widely used family of algorithms for large-scale optimization of machi...
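For reference, a minimal sketch of the heavy-ball form of SGDM, $m_{t+1} = \beta m_t + g_t$, $x_{t+1} = x_t - \gamma m_{t+1}$, on an illustrative quadratic (the objective, noise scale, and constants are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

def sgdm(grad_fn, x0, gamma=0.01, beta=0.9, T=1000):
    # Heavy-ball SGD with momentum on noisy gradients of grad_fn.
    x, m = x0, np.zeros_like(x0)
    for _ in range(T):
        g = grad_fn(x) + 0.1 * rng.normal(size=x.shape)  # noisy gradient estimate
        m = beta * m + g                                  # momentum buffer update
        x = x - gamma * m                                 # parameter update
    return x

# Illustrative objective f(x) = ||x||^2, with gradient 2x.
x_final = sgdm(lambda x: 2 * x, np.ones(3))
print("final iterate:", x_final)
```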
Stochastic Gradient Descent (SGD) type algorithms have been widely applied to many stochastic optimi...
Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization ...
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" lear...
We consider the minimization of an objective function given access to unbiased estimates of its grad...
We consider the optimization of a smooth and strongly convex objective using constant step-size stoc...
Understanding the convergence performance of the asynchronous stochastic gradient descent method (Async-...
Stochastic gradient descent (SGD) is a promising numerical method for solving large-scale inverse pr...