We analyze new online gradient descent algorithms for distributed systems with large delays between gradient computations and the corresponding updates. Using insights from adaptive gradient methods, we develop algorithms that adapt not only to the sequence of gradients, but also to the precise update delays that occur. We first give an impractical algorithm that achieves a regret bound that precisely quantifies the impact of the delays. We then analyze AdaptiveRevision, an algorithm that is efficiently implementable and achieves comparable guarantees. The key algorithmic technique is appropriately and efficiently revising the learning rate used for previous gradient steps. Experimental results show when the delays grow large (1000 update...
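The abstract above describes adapting learning rates under update delays. As a rough illustration only (this is not the paper's AdaptiveRevision algorithm, and `grad_fn`, `delays`, and all parameter values are assumptions for the sketch), here is a minimal delayed online gradient descent loop where a gradient computed at step t is applied `delays[t]` steps later, with an AdaGrad-style per-coordinate step size taken from the statistics available at apply time:

```python
import numpy as np

def delayed_adaptive_ogd(grad_fn, x0, delays, steps, eta=0.5, eps=1e-8):
    """Sketch: online gradient descent where the gradient computed at step t
    is applied only delays[t] steps later; the per-coordinate step size
    adapts (AdaGrad-style) to the squared gradients applied so far."""
    x = np.array(x0, dtype=float)
    pending = {}                 # arrival step -> list of gradients to apply
    g_sq = np.zeros_like(x)      # accumulated squared gradients
    for t in range(steps):
        # apply any gradients whose delay has now elapsed
        for g in pending.pop(t, []):
            g_sq += g * g
            # step size reflects statistics at apply time, not compute time
            x -= eta * g / np.sqrt(g_sq + eps)
        # compute a gradient at the current point; it arrives after the delay
        g = grad_fn(x)
        pending.setdefault(t + 1 + delays[t], []).append(g)
    return x
```

On a simple quadratic such as f(x) = ||x||^2, the iterate still drifts toward the minimum even with a delay of 10 steps, since stale gradients keep the correct sign and the adaptive rate damps the resulting overshoot.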
Online learning algorithms have impressive convergence properties when it comes to risk minimization...
With the recent proliferation of large-scale learning problems, there has been a lot of interest o...
Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine learning, representing the o...
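For reference alongside the SGD abstract above, a minimal stochastic gradient descent loop (an illustrative sketch, not taken from any of the cited papers; the 1/sqrt(t) schedule and all parameter values are assumptions) might look like:

```python
import numpy as np

def sgd(grad_sample, x0, n_steps, lr=0.5, seed=0):
    """Plain stochastic gradient descent: at each step, draw one noisy
    gradient estimate and move against it with a decaying step size."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    for t in range(n_steps):
        g = grad_sample(x, rng)
        x -= lr / np.sqrt(t + 1) * g   # 1/sqrt(t) decay, a standard schedule
    return x

# Example: minimize E[(x - y)^2 / 2] with y ~ N(3, 0.1^2); the sampled
# gradient is x - y, and the iterate settles near the mean 3.0.
x_hat = sgd(lambda x, rng: x - (3.0 + 0.1 * rng.standard_normal()), 0.0, 2000)
```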
Distributed learning aims at computing high-quality models by training over sc...
In large-scale optimization problems, distributed asynchronous stochastic gradient descent (DASGD) i...
We present a unified, black-box-style method for developing and analyzing online convex optimization...
We develop and analyze an asynchronous algorithm for distributed convex optimization when the object...
One of the most widely used methods for solving large-scale stochastic optimiz...
In this paper, we provide a general framework for studying multi-agent online ...