In the context of optimizing Deep Neural Networks, we propose to rescale the learning rate using a new automatic-differentiation technique. This technique relies on the computation of the {\em curvature}, a second-order quantity whose computational complexity lies between that of the gradient and that of the Hessian-vector product. If (1C,1M) represents respectively the computational time and memory footprint of the gradient method, the new technique increases the overall cost to either (1.5C,2M) or (2C,1M). This rescaling has the appealing property of a natural interpretation: it lets the practitioner trade off exploration of the parameter set against convergence of the algorithm. The resca...
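The abstract gives no code, and its technique is reportedly cheaper than a full Hessian-vector product, so the following is only a minimal sketch of the underlying idea: the curvature along the current gradient direction, g^T H g / g^T g, computed with standard forward-over-reverse automatic differentiation (roughly twice the cost of one gradient, matching the (2C,1M) regime). The loss, the `curvature_along_gradient` helper, and the clipping constant are illustrative assumptions, not the paper's actual method.

```python
import jax
import jax.numpy as jnp

def loss(theta, x, y):
    # Illustrative least-squares objective; the paper's loss is not specified.
    pred = jnp.tanh(x @ theta)
    return jnp.mean((pred - y) ** 2)

def curvature_along_gradient(theta, x, y):
    # g = dL/dtheta via reverse-mode autodiff.
    g = jax.grad(loss)(theta, x, y)
    # Hg via forward-over-reverse: a jvp of the gradient map in direction g.
    _, hg = jax.jvp(lambda t: jax.grad(loss)(t, x, y), (theta,), (g,))
    # Normalised directional curvature g^T H g / g^T g.
    return jnp.vdot(g, hg) / jnp.vdot(g, g), g

def rescaled_step(theta, x, y, base_lr=1.0):
    kappa, g = curvature_along_gradient(theta, x, y)
    # Newton-like rescaling of the learning rate along the gradient direction;
    # the floor keeps the step well-defined when curvature is tiny or negative.
    lr = base_lr / jnp.maximum(kappa, 1e-8)
    return theta - lr * g
```

Here theta is assumed to be a flat parameter vector; for a real network one would flatten the parameter pytree first (e.g. with jax.flatten_util.ravel_pytree).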
We analyse the dynamics of a number of second order on-line learning algorithms training multi-layer...
Fractional calculus is an emerging topic in artificial neural network training, especially when usin...
Incorporating second-order curvature information into machine learning optimization algorithms can b...
Second-order optimization methods applied to train deep neural networks use the curvature informat...
The deep learning community has devised a diverse set of methods to make gradient optimization, usin...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
In this paper, we incorporate the Barzilai-Borwein step size into gradient descent methods used to t...
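For context on the Barzilai-Borwein (BB) step size mentioned in the entry above: the classical BB1 rule sets eta_k = (s^T s)/(s^T y), with s = theta_k - theta_{k-1} and y = g_k - g_{k-1}. The truncated abstract does not say which BB variant or stabilisation the paper uses, so this is a sketch of the textbook deterministic rule, not the paper's method:

```python
import jax
import jax.numpy as jnp

def bb_gradient_descent(loss, theta, num_steps=100, eta0=1e-3):
    """Gradient descent with the long Barzilai-Borwein step size
    eta_k = (s^T s) / (s^T y)."""
    grad_fn = jax.grad(loss)
    g_prev = grad_fn(theta)
    theta_prev = theta
    theta = theta - eta0 * g_prev            # first step uses a fixed eta0
    for _ in range(num_steps - 1):
        g = grad_fn(theta)
        s = theta - theta_prev               # parameter displacement
        y = g - g_prev                       # gradient displacement
        eta = jnp.vdot(s, s) / jnp.vdot(s, y)  # BB1 step size
        theta_prev, g_prev = theta, g
        theta = theta - eta * g
    return theta
```

Note that in the stochastic setting abstracts like this one target, the raw BB ratio is noisy and is typically smoothed or averaged over epochs before use.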
Training deep neural networks is difficult due to the pathological curvature problem. Re-p...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
In the recent decade, deep neural networks have solved ever more complex tasks across many fronts in...
We propose a fast second-order method that can be used as a drop-in replacement for current deep lea...
Backpropagation using Stochastic Diagonal Approximate Greatest Descent (SDAGD) is a novel adaptive s...
In modern supervised learning, many deep neural networks are able to interpolate the data: the empir...
The highly non-linear nature of deep neural networks causes them to be susceptible to adversarial ex...