<p>Stochastic gradient optimization is a class of widely used algorithms for training machine learning models. To optimize an objective, it uses a noisy gradient computed from random data samples instead of the true gradient computed from the entire dataset. However, when the variance of the noisy gradient is large, the algorithm may spend much of its time bouncing around, leading to slower convergence and worse performance. In this paper, we develop a general approach that uses control variates for variance reduction in stochastic gradient optimization. Data statistics such as low-order moments (pre-computed or estimated online) are used to form the control variate. We demonstrate how to construct the control variate for two practical problems using stoc...
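The control-variate idea described above can be illustrated with a small sketch. It is not the construction from the paper, whose details are cut off in this abstract. It assumes a plain least-squares objective, takes the data cross-moment term `-x_i * y_i` as the control variate, and computes the scaling coefficient `a` from the full dataset for clarity (in practice it would be estimated). All function and variable names are hypothetical.

```python
import numpy as np

def per_sample_grads(w, X, y):
    # Gradient of 0.5 * (x_i @ w - y_i)**2 for every row i of X.
    residual = X @ w - y
    return X * residual[:, None]

def control_variate(X, y):
    # h_i = -x_i * y_i: correlated with the gradient, and its mean is a
    # data statistic (a second-order cross-moment) that can be pre-computed.
    return -X * y[:, None]

def optimal_scale(G, H):
    # Scalar a minimizing the total (trace) variance of g - a * (h - E[h]).
    Gc = G - G.mean(axis=0)
    Hc = H - H.mean(axis=0)
    return float((Gc * Hc).sum() / (Hc * Hc).sum())

# Synthetic regression data (an assumption made only for this illustration).
rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
w = 0.5 * rng.normal(size=d)  # current iterate

G = per_sample_grads(w, X, y)   # noisy per-sample gradients
H = control_variate(X, y)       # per-sample control variates
h_mean = H.mean(axis=0)         # pre-computed expectation of h
a = optimal_scale(G, H)

# Variance-reduced per-sample gradients; subtracting a zero-mean term
# leaves the expectation unchanged.
G_cv = G - a * (H - h_mean)

var_plain = G.var(axis=0).sum()
var_cv = G_cv.var(axis=0).sum()
print(f"total variance without control variate: {var_plain:.4f}")
print(f"total variance with control variate:    {var_cv:.4f}")
```

Because `H - h_mean` averages to zero over the dataset, the corrected gradient keeps the same mean as the plain one for any value of `a`. How much the variance shrinks depends on how strongly the control variate correlates with the gradient at the current iterate.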
Stochastic Gradient Descent (SGD) is a workhorse in machine learning, yet its ...
This work considers optimization methods for large-scale machine learning (ML). Optimization in ML ...
Stochastic gradient MCMC (SGMCMC) offers a scalable alternative to traditional MCMC, by constructing...
The field of statistical machine learning has seen rapid progress in complex hierarchical Bayesian...
Stochastic gradient descent is popular for large scale optimization but has slow convergence asympto...
We consider the fundamental problem in nonconvex optimization of efficiently reaching a stationary p...
237 pages. It seems that in the current age, computers, computation, and data have an increasingly imp...
In this paper, we propose a novel reinforcement-learning algorithm consisting in a stochastic varian...
Gradient estimation -- approximating the gradient of an expectation with respect to the parameters o...
Learning models with discrete latent variables using stochastic gradient descent remains a challenge...
The stochastic gradient Langevin Dynamics is one of the most fundamental algorithms to solve samplin...
Optimization with noisy gradients has become ubiquitous in statistics and machine learning. Reparame...
17 pages, 2 figures, 1 table. Our goal is to improve variance reducing stochasti...