Classically, the time complexity of a first-order method is estimated by its number of gradient computations. In this paper, we study a more refined complexity by taking into account the “lingering” of gradients: once a gradient is computed at x_k, the additional time to compute gradients at x_{k+1}, x_{k+2}, . . . may be reduced. We show how this improves the running time of gradient descent and SVRG. For instance, if the “additional time” scales linearly with respect to the traveled distance, then the “convergence rate” of gradient descent can be improved from 1/T to exp(−T^{1/3}). On the empirical side, we solve a hypothetical revenue management problem on the Yahoo! Front Page Today Module applic...
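A minimal toy sketch of the cost model described in this abstract, not the paper's algorithm: after a full gradient is computed at a point, the next gradient is charged a cost proportional to the distance traveled since then. The quadratic objective, step size, number of steps, and cost coefficients below are illustrative assumptions.

```python
# Toy illustration (assumed setup) of the "lingering gradients" accounting:
# the first gradient costs a full unit; each later gradient is charged
# per_distance_cost * ||x_k - x_{k-1}|| instead of a full recomputation.
import numpy as np

def grad(x, A, b):
    """Gradient of the quadratic f(x) = 0.5 * x^T A x - b^T x (toy objective)."""
    return A @ x - b

def gd_with_lingering_cost(A, b, x0, lr=0.1, steps=200,
                           full_cost=1.0, per_distance_cost=0.5):
    """Plain gradient descent, timed under the lingering cost model."""
    x = x0.copy()
    prev_x = None
    total_time = 0.0
    for _ in range(steps):
        g = grad(x, A, b)
        if prev_x is None:
            total_time += full_cost  # first gradient: full price
        else:
            # subsequent gradients: cost scales with distance traveled
            total_time += per_distance_cost * np.linalg.norm(x - prev_x)
        prev_x = x.copy()
        x = x - lr * g
    return x, total_time

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.diag(rng.uniform(0.5, 2.0, size=10))  # well-conditioned toy problem
    b = rng.normal(size=10)
    x, t = gd_with_lingering_cost(A, b, x0=np.zeros(10))
    print(f"residual norm: {np.linalg.norm(A @ x - b):.2e}, accounted time: {t:.2f}")
    print("classical accounting would charge", 200 * 1.0, "units for the same run")
```

Under this accounting, short steps near the optimum become cheap, which is the intuition behind the improved running-time bounds claimed in the abstract.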
For smooth and strongly convex optimizations, the optimal iteration complexity of the gradient-based...
The integration to steady state of many initial value ODEs and PDEs using the forward Euler method c...
Online learning algorithms often require recomputing least squares regression estimates of paramete...
Noise is inherent in many optimization methods such as stochastic gradient methods, zeroth-order me...
The practical performance of online stochastic gradient descent algorithms is highly depende...
We present a strikingly simple proof that two rules are sufficient to automate gradient descent: 1) ...
We present and computationally evaluate a variant of the fast gradient method by Nesterov that is ca...
During the last few decades, several papers were published about second-order optimizatio...
We study a scalable alternative to robust gradient descent (RGD) techniques that can be used when lo...
Stochastic Gradient Descent (SGD) is a workhorse in machine learning, yet its ...
Interpreting gradient methods as fixed-point iterations, we provide a detailed analysis of those met...
The performance of stochastic gradient descent (SGD) depends critically on how learning rates are ...
The Inexact Gradient Method with Memory (IGMM) is able to considerably outperform the Gradient Metho...
Gradient descent is slow to converge for ill-conditioned problems and non-convex problems. An import...