Replica exchange stochastic gradient Langevin dynamics (reSGLD) has shown promise in accelerating convergence in non-convex learning; however, the excessively large correction required to avoid biases from noisy energy estimators has limited the potential of the acceleration. To address this issue, we study variance reduction for the noisy energy estimators, which promotes much more effective swaps. Theoretically, we provide a non-asymptotic analysis of the exponential acceleration for the underlying continuous-time Markov jump process; moreover, we consider a generalized Girsanov theorem, which includes the change of Poisson measure, to overcome the crude discretization based on Grönwall's inequality, and which yields a much tighter error in the 2-Wasserstein (W2) distance.
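To make the swap mechanism concrete, below is a minimal, self-contained Python sketch of a two-replica exchange SGLD loop in which an SVRG-style control variate reduces the variance of the minibatch energy estimator before the swap test. This is an illustration of the general technique only, not the paper's implementation: the toy Gaussian target, the step size, the anchor-refresh schedule, and the scalar `correction` are all hypothetical choices.

```python
# Sketch: replica exchange SGLD with variance-reduced energy estimators.
# All hyperparameters and the toy target below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy target: U(x) = 0.5 * sum_i (d_i - x)^2, so minibatch estimates are noisy.
data = rng.normal(1.0, 1.0, size=1000)

def energy(x, batch):
    # Minibatch estimate of the energy, rescaled to the full data set size.
    return len(data) / len(batch) * 0.5 * np.sum((batch - x) ** 2)

def grad_energy(x, batch):
    # Minibatch estimate of the energy gradient, rescaled likewise.
    return len(data) / len(batch) * np.sum(x - batch)

def sgld_step(x, tau, lr, batch):
    # One SGLD update at temperature tau.
    noise = np.sqrt(2.0 * lr * tau) * rng.normal()
    return x - lr * grad_energy(x, batch) + noise

def vr_energy(x, anchor, full_energy_at_anchor, batch):
    # SVRG-style control variate: anchor the estimator at a snapshot point
    # whose full-batch energy is known; the minibatch noise largely cancels,
    # so the swap correction can be kept small.
    return full_energy_at_anchor + (energy(x, batch) - energy(anchor, batch))

tau_low, tau_high = 1.0, 10.0        # temperatures of the two replicas
lr, batch_size, n_steps = 1e-4, 32, 2000
correction = 1.0                     # hedge against residual estimator variance
x_low, x_high = 0.0, 5.0             # initial states of the two replicas

for step in range(n_steps):
    if step % 100 == 0:
        # Refresh anchors with a full-batch pass (the expensive part).
        a_low, a_high = x_low, x_high
        E_a_low, E_a_high = energy(a_low, data), energy(a_high, data)

    batch = rng.choice(data, size=batch_size, replace=False)
    x_low = sgld_step(x_low, tau_low, lr, batch)
    x_high = sgld_step(x_high, tau_high, lr, batch)

    # Swap test using variance-reduced energies minus a correction term;
    # lower estimator variance permits a smaller correction and hence
    # more frequent accepted swaps.
    dE = (vr_energy(x_low, a_low, E_a_low, batch)
          - vr_energy(x_high, a_high, E_a_high, batch))
    log_swap = (1.0 / tau_low - 1.0 / tau_high) * (dE - correction)
    if np.log(rng.uniform()) < min(0.0, log_swap):
        x_low, x_high = x_high, x_low
```

The key design point the sketch tries to show: the swap acceptance is exponentially sensitive to the energy difference, so variance in the energy estimator must be paid for with a large bias correction; the control variate shrinks that variance, allowing a smaller correction and more effective swaps.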