This paper provides a framework for analyzing stochastic gradient algorithms in a mean squared error (MSE) sense by applying the asymptotic normality result for stochastic gradient descent (SGD) iterates to the finite-iteration case. Specifically, we consider problems where the gradient estimators are biased but have reduced variance, and we compare the iterates they generate to those produced by the plain SGD algorithm. Using the work of Fabian, we characterize the mean and the variance of the distribution of the iterates in terms of the bias and the covariance matrix of the gradient estimators. We introduce the sliding window SGD (SW-SGD) a...
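The bias–variance tradeoff described above can be illustrated with a minimal sketch. Note the abstract does not specify the SW-SGD update rule, so the `window`-averaged estimator below is only an assumed illustration, not the paper's algorithm: averaging the last few stochastic gradients lowers the variance of the update, while the staleness of old gradients (evaluated at earlier iterates) introduces bias.

```python
import random
from collections import deque

def noisy_grad(x, noise_scale=1.0):
    # Noisy gradient of f(x) = x^2 / 2; the true gradient is x.
    return x + random.gauss(0.0, noise_scale)

def sgd(steps=2000, lr=0.05, window=1, seed=0):
    """Plain SGD when window=1; otherwise a hypothetical sliding-window
    variant that averages the last `window` stochastic gradients.
    The averaged estimator has lower variance but is biased, since old
    gradients were evaluated at stale iterates."""
    random.seed(seed)
    x = 2.0
    history = deque(maxlen=window)  # most recent stochastic gradients
    for _ in range(steps):
        history.append(noisy_grad(x))
        g = sum(history) / len(history)  # windowed gradient estimate
        x -= lr * g
    return x

x_plain = sgd(window=1)    # higher-variance, unbiased updates
x_window = sgd(window=10)  # lower-variance, biased updates
```

Both runs contract toward the minimizer at 0; the windowed run typically wanders less around it, at the cost of the bias analyzed in the paper via Fabian's characterization.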
We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for q...
Recent studies have provided both empirical and theoretical evidence illustrat...
In this thesis we want to give a theoretical and practical introduction to stochastic gradient desce...
We develop the mathematical foundations of the stochastic modified equations (SME) framework for ana...
With the purpose of examining biased updates in variance-reduced stochastic gradient methods, we int...
Stochastic gradient descent is popular for large scale optimization but has slow convergence asympto...
Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization pr...
Stochastic variance reduced gradient (SVRG) is a popular variance reduction technique for accelerati...
Applying standard Markov chain Monte Carlo (MCMC) algorithms to large data sets is computationally e...
237 pages. It seems that in the current age, computers, computation, and data have an increasingly imp...
In this dissertation, we propose two new types of stochastic approximation (SA) methods and study th...
The field of statistical machine learning has seen a rapid progress in complex hierarchical Bayesian...
The asymptotic behavior of the stochastic gradient algorithm using biased gradient estimates is anal...