Tuning the step size of stochastic gradient descent is tedious and error prone. This has motivated the development of methods that automatically adapt the step size using readily available information. In this paper, we consider the family of SPS (Stochastic gradient with a Polyak Stepsize) adaptive methods. These are methods that make use of gradient and loss value at the sampled points to adaptively adjust the step size. We first show that SPS and its recent variants can all be seen as extensions of the Passive-Aggressive methods applied to nonlinear problems. We use this insight to develop new variants of the SPS method that are better suited to nonlinear models. Our new variants are based on introducing a slack variable into the interpo...
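The Polyak-stepsize rule described above (using the sampled loss and gradient to set the step size) can be sketched in a few lines. The sketch below is a hypothetical helper, not the authors' implementation: `f_star` stands for the optimal value of the sampled loss (zero under interpolation) and `gamma_max` is the cap used by the SPS_max variant of Loizou et al. (2021).

```python
import numpy as np

def sps_step(x, grad, loss, f_star=0.0, gamma_max=1.0, eps=1e-12):
    """One SGD step with a capped stochastic Polyak stepsize (SPS_max sketch).

    The stepsize is gamma = (f_i(x) - f_i^*) / ||grad f_i(x)||^2, computed
    from the sampled loss and gradient only, then capped at gamma_max.
    """
    gamma = (loss - f_star) / (np.dot(grad, grad) + eps)
    return x - min(gamma, gamma_max) * grad
```

On a consistent least-squares problem (where interpolation holds, so `f_star = 0` for every sample) this update coincides with a damped randomized Kaczmarz step, which is one way to see why no stepsize tuning is needed.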
In recent years several proposals for the step-size selection have largely improved the gradient me...
Stochastic Gradient Descent (SGD) is a popular tool in training large-scale machine learning models....
A difficulty in using Simultaneous Perturbation Stochastic Approximation (SPSA) is its performance ...
Recently, Loizou et al. (2021) proposed and analyzed stochastic gradient descent (SGD) with stochas...
The recently proposed stochastic Polyak stepsize (SPS) and stochastic linesearch (SLS) for SGD have ...
Stochastic gradient descent (SGD) is commonly used in solving finite sum optimization problems. The ...
We design step-size schemes that make stochastic gradient descent (SGD) adaptive to (i) the noise σ ...
The steplength selection is a crucial issue for the effectiveness of the stochastic gradient methods...
We study convergence rates of AdaGrad-Norm as an exemplar of adaptive stochastic gradient methods (S...
Noise is inherent in many optimization methods such as stochastic gradient methods, zeroth-order me...
Mini-batch stochastic gradient descent (SGD) and variants thereof approximate the objective function...
In view of a direct and simple improvement of vanilla SGD, this paper presents...
The convergence of Stochastic Gradient Descent (SGD) us...
We aim to make stochastic gradient descent (SGD) adaptive to (i) the noise $\sigma^2$ in the stochas...
A theoretical, and potentially also practical, problem with stochastic gradient descent is that traj...