In empirical risk optimization, it has been observed that stochastic gradient implementations that rely on random reshuffling of the data achieve better performance than implementations that rely on sampling the data uniformly. Recent works have pursued justifications for this behavior by examining the convergence rate of the learning process under diminishing step sizes. This work focuses instead on the constant step-size case and on strongly convex loss functions. In this case, convergence is only guaranteed to a small neighborhood of the optimizer, albeit at a linear rate. The analysis establishes analytically that random reshuffling outperforms uniform sampling by showing explicitly that the iterates approach a smaller neighborhood of size O(μ²) around the optimizer, as opposed to the larger O(μ) neighborhood obtained under uniform sampling.
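To make the contrast concrete, the following is a minimal sketch (not taken from the paper) of constant step-size stochastic gradient descent under the two sampling schemes on a simple strongly convex least-squares risk. The problem setup, the function names `sgd_uniform` and `sgd_reshuffle`, and the step size `mu = 0.01` are illustrative assumptions; with a small constant step size, the reshuffled iterates typically settle noticeably closer to the minimizer, consistent with the O(μ²) versus O(μ) comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 200, 5
A = rng.standard_normal((N, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(N)
w_star = np.linalg.lstsq(A, b, rcond=None)[0]   # empirical risk minimizer

def grad(w, i):
    """Gradient of the i-th loss 0.5 * (a_i^T w - b_i)^2."""
    return (A[i] @ w - b[i]) * A[i]

def sgd_uniform(mu, epochs):
    """Constant step-size SGD with i.i.d. uniform sampling."""
    w = np.zeros(d)
    for _ in range(epochs * N):
        i = rng.integers(N)          # sample one index uniformly with replacement
        w -= mu * grad(w, i)
    return w

def sgd_reshuffle(mu, epochs):
    """Constant step-size SGD with random reshuffling: fresh permutation each epoch."""
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(N):  # visit every sample exactly once per epoch
            w -= mu * grad(w, i)
    return w

mu, epochs = 0.01, 200
for name, algo in [("uniform", sgd_uniform), ("reshuffling", sgd_reshuffle)]:
    err = np.linalg.norm(algo(mu, epochs) - w_star) ** 2
    print(f"{name:12s}  ||w - w*||^2 = {err:.2e}")
```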