We provide the first theoretical analysis of the convergence rate of asynchronous mini-batch gradient descent with variance reduction (AsySVRG) for non-convex optimization. Asynchronous stochastic gradient descent (AsySGD) has been widely used for deep learning optimization, and it has been proved to converge at a rate of O(1/\sqrt{T}) for non-convex optimization. Recently, the variance reduction technique was proposed and shown to greatly accelerate the convergence of SGD. Asynchronous SGD with variance reduction has been shown to achieve a linear convergence rate when the problem is strongly convex. However, there is still no analysis of the convergence rate of this method for non-convex problems. In this paper, we con...
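For context, the sketch below shows a minimal serial SVRG-style update loop in Python, illustrating the variance reduction technique this abstract builds on; the function names grad_full and grad_i, the hyperparameters, and the loop structure are hypothetical assumptions for illustration, and the asynchronous, mini-batch machinery of AsySVRG itself is not reproduced here.

```python
# Minimal serial SVRG sketch (illustrative only; names and hyperparameters are hypothetical).
# AsySVRG as described above layers asynchronous parallel updates on top of
# this kind of variance-reduced gradient estimator.
import numpy as np

def svrg(grad_full, grad_i, w0, n, step=0.01, epochs=20, inner=100, seed=None):
    """grad_full(w): full gradient; grad_i(w, i): gradient of the i-th sample."""
    rng = np.random.default_rng(seed)
    w_snap = np.asarray(w0, dtype=float).copy()
    for _ in range(epochs):
        mu = grad_full(w_snap)              # full gradient at the snapshot point
        w = w_snap.copy()
        for _ in range(inner):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient: unbiased, and its variance
            # shrinks as the iterate w approaches the snapshot w_snap.
            v = grad_i(w, i) - grad_i(w_snap, i) + mu
            w = w - step * v
        w_snap = w                          # refresh the snapshot
    return w_snap
```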
Stochastic Gradient Descent (SGD) is very useful in optimization problems with high-dimensional non-...
Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine learning, representing the o...
The existing analysis of asynchronous stochastic gradient descent (SGD) degrad...
Nowadays, asynchronous parallel algorithms have received much attention in the optimization field du...
We consider the fundamental problem in nonconvex optimization of efficiently reaching a stationary p...
In machine learning research, many emerging applications can be (re)formulated as the composition op...
With the recent proliferation of large-scale learning problems, there has been a lot of interest o...
This thesis proposes and analyzes several first-order methods for convex optimization, designed for ...
Stochastic gradient descent (SGD) and its variants have become more and more popular in machine lear...
Recent studies have illustrated that stochastic gradient Markov Chain Monte Carlo techniques have a ...
Mini-batch algorithms have been proposed as a way to speed-up stochastic convex optimization problem...