Recent studies have illustrated that stochastic gradient Markov Chain Monte Carlo techniques have a strong potential in non-convex optimization, where local and global convergence guarantees can be shown under certain conditions. Building on this recent theory, in this study we develop an asynchronous-parallel stochastic L-BFGS algorithm for non-convex optimization. The proposed algorithm is suitable for both distributed and shared-memory settings. We provide formal theoretical analysis and show that the proposed method achieves an ergodic convergence rate of O(1/√N) (N being the total number of iterations) and that it can achieve a linear speedup under certain conditions. We perform several experiments on both synthetic and real datasets.
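The abstract names the ingredients but gives no pseudocode. The NumPy sketch below is only a rough, sequential illustration of how an SG-MCMC-flavoured stochastic L-BFGS step can look: precondition a minibatch gradient with the L-BFGS two-loop recursion, then inject Gaussian noise. The names (two_loop_recursion, sg_mcmc_lbfgs, grad_fn), the step size, the temperature temp, the memory size, and the curvature test are all illustrative assumptions, not the paper's actual algorithm; in particular, the asynchronous/delayed-gradient machinery the paper analyses is not modelled here.

    import numpy as np

    def two_loop_recursion(grad, s_list, y_list):
        # Classic L-BFGS two-loop recursion: approximates H @ grad, where H is
        # the inverse-Hessian estimate implied by the curvature pairs (s_i, y_i).
        q = grad.copy()
        rhos = [1.0 / (y @ s) for s, y in zip(s_list, y_list)]
        alphas = []
        for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
            a = rho * (s @ q)
            alphas.append(a)
            q -= a * y
        if s_list:  # initial Hessian scaling gamma = s'y / y'y
            gamma = (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
            q *= gamma
        for s, y, rho, a in zip(s_list, y_list, rhos, reversed(alphas)):
            b = rho * (y @ q)
            q += (a - b) * s
        return q  # with empty memory this is just grad (plain SGD direction)

    def sg_mcmc_lbfgs(grad_fn, x0, n_iters=2000, step=1e-3, memory=10,
                      temp=1e-4, seed=0):
        # One worker's loop (hypothetical): precondition a stochastic gradient
        # with the L-BFGS direction, then add Gaussian noise as in SG-MCMC.
        # Asynchrony (stale reads/writes across workers) is NOT modelled.
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float).copy()
        s_list, y_list = [], []
        g = grad_fn(x, rng)
        for _ in range(n_iters):
            d = two_loop_recursion(g, s_list, y_list)
            noise = rng.normal(size=x.shape) * np.sqrt(2.0 * step * temp)
            x_new = x - step * d + noise
            g_new = grad_fn(x_new, rng)
            s, y = x_new - x, g_new - g
            if s @ y > 1e-10:  # skip pairs that would break positive definiteness
                s_list.append(s); y_list.append(y)
                if len(s_list) > memory:
                    s_list.pop(0); y_list.pop(0)
            x, g = x_new, g_new
        return x

    # Toy non-convex objective f(x) = sum(x^4 - x^2); the added Gaussian term
    # stands in for minibatch gradient noise.
    grad_fn = lambda x, rng: 4 * x**3 - 2 * x + 0.1 * rng.normal(size=x.shape)
    x_star = sg_mcmc_lbfgs(grad_fn, x0=np.full(5, 2.0))  # → near ±1/√2

The curvature test s @ y > 0 is the standard way to keep the implicit inverse-Hessian estimate positive definite when gradients are noisy; a real asynchronous implementation would additionally have to cope with stale parameter reads, which is exactly what the paper's delay analysis addresses.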
We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strong...
This thesis presents a parallel algorithm for non-convex large-scale stochastic optimization problem...
We describe several features of parallel or distributed asynchronous iterative...
We develop stochastic variants of the well-known BFGS quasi-Newton optimization method, in both full ...
Stochastic gradient descent (SGD) and its variants have attracted much attention in machine learning...
Nowadays, asynchronous parallel algorithms have received much attention in the optimization field du...
We provide the first theoretical analysis on the convergence rate of asynchronous mini-batch gradie...
In this paper, a stochastic quasi-Newton algorithm for nonconvex stochastic optimization is presente...
Stochastic Gradient Descent (SGD) is very useful in optimization problems with high-dimensional non-...
Stochastic gradient descent (SGD) and its variants have become more and more popular in machine lear...
In machine learning research, many emerging applications can be (re)formulated as the composition op...