Incorporating second-order curvature information into machine learning optimization algorithms can be subtle, and doing so naïvely can lead to high per-iteration costs associated with forming the Hessian and performing the associated linear system solve. To address this, we introduce ADAHESSIAN, a new stochastic optimization algorithm. ADAHESSIAN directly incorporates approximate curvature information from the loss function, and it includes several novel performance-improving features, including: (i) a fast Hutchinson-based method to approximate the curvature matrix with low computational overhead; (ii) spatial averaging to reduce the variance of the second derivative; and (iii) a root-mean-square exponential moving average to smooth out variations of the Hessian diagonal across different iterations.
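Component (i) admits a compact sketch. The snippet below is a minimal, illustrative reconstruction of Hutchinson's diagonal-Hessian estimator in PyTorch, not the authors' released implementation; the function name hutchinson_diag_hessian, the number of probe vectors n_probes, and the toy quadratic loss are assumptions made for this example.

```python
# Illustrative sketch of Hutchinson's diagonal-Hessian estimator (not the
# authors' code). Estimates diag(H) via E[z * (H z)] with Rademacher probes z.
import torch

def hutchinson_diag_hessian(loss, params, n_probes=1):
    """Return a per-parameter estimate of the Hessian diagonal of `loss`."""
    # First backward pass with create_graph=True so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    diag_est = [torch.zeros_like(p) for p in params]
    for _ in range(n_probes):
        # Rademacher probes: entries are +1 or -1 with equal probability.
        zs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        # Hessian-vector product H z via a second backward pass.
        hvps = torch.autograd.grad(grads, params, grad_outputs=zs,
                                   retain_graph=True)
        for d, z, hv in zip(diag_est, zs, hvps):
            d.add_(z * hv / n_probes)  # average z ⊙ (H z) over probes
    return diag_est

# Usage on a tiny quadratic, where the Hessian diagonal is known exactly:
w = torch.randn(5, requires_grad=True)
loss = (torch.arange(1.0, 6.0) * w**2).sum()  # H = diag(2, 4, 6, 8, 10)
print(hutchinson_diag_hessian(loss, [w], n_probes=100)[0])
```

For a Rademacher vector z, E[z ⊙ Hz] = diag(H), so each probe costs only one Hessian-vector product (two backward passes) rather than forming the full Hessian, which is what keeps the per-iteration overhead low.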
This paper proposes an improved stochastic second-order learning algorithm for supervised neural net...
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appea...
Gradient-based optimization and Markov Chain Monte Carlo sampling can be found at the heart of a mul...
Hessian-based analysis/computation is widely used in scientific computing. However, due to the (inco...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
This thesis presents a family of adaptive curvature methods for gradient-based stochastic ...
The emergent field of machine learning has by now become the main proponent of data-driven discovery...
The interplay between optimization and machine learning is one of the most important developments in...
In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic opti...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
In the context of the optimization of Deep Neural Networks, we propose to rescale the learning rate ...
This work considers optimization methods for large-scale machine learning (ML). Optimization in ML ...
We propose a new per-layer adaptive step-size procedure for stochastic first-order optimization meth...
We propose a fast second-order method that can be used as a drop-in replacement for current deep lea...
In the past decade, neural networks have demonstrated impressive performance in supervised learning....