It is well known that second-order optimizers can accelerate the training of deep neural networks; however, the huge computational cost of second-order optimization makes it impractical to apply in practice. To reduce this cost, many methods have been proposed to approximate the second-order matrix. Inspired by KFAC, we propose a novel Trace-based Hardware-driven layer-ORiented Natural Gradient Descent Computation method, called THOR, to make second-order optimization applicable to real application models. Specifically, we gradually increase the update interval and use the matrix trace to determine which blocks of the Fisher Information Matrix (FIM) need to be updated. Moreover, by resorting to the power of hardware, we have design...
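The trace-based selection described in this abstract can be pictured with a small sketch. The function name `select_blocks_to_update`, the relative-change threshold `tol`, and the exact comparison rule below are illustrative assumptions, not THOR's published criterion.

```python
import numpy as np

def select_blocks_to_update(fim_blocks, prev_traces, tol=0.01):
    """Illustrative sketch: refresh a layer's FIM block only when the
    relative change of its trace since the last refresh exceeds tol."""
    to_update = []
    new_traces = []
    for i, (block, prev_tr) in enumerate(zip(fim_blocks, prev_traces)):
        tr = np.trace(block)  # cheap scalar summary of the block
        if prev_tr is None or abs(tr - prev_tr) > tol * abs(prev_tr):
            to_update.append(i)      # block moved enough: recompute it
            new_traces.append(tr)
        else:
            new_traces.append(prev_tr)  # reuse the stale factor
    return to_update, new_traces
```

Under this sketch, only the blocks whose indices are returned would have their (expensive) inverses recomputed; the remaining layers keep their previously cached factors.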
We design four novel approximations of the Fisher Information Matrix (FIM) tha...
Deep neural networks currently play a prominent role in solving problems across a wide variety of di...
Second-order optimizers are thought to hold the potential to speed up neural network training, but d...
Second-order optimization methods have the ability to accelerate convergence by modifying the gradie...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
We propose a fast second-order method that can be used as a drop-in replacement for current deep lea...
Second-order optimization methods applied to train deep neural networks use the curvature informat...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
Training deep neural networks consumes increasing computational resource shares in many compute cent...
Training deep convolutional neural networks such as VGG and ResNet by gradient descent ...
Several studies have shown the ability of natural gradient descent to minimize the objective functio...
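As background for these studies, the natural gradient step preconditions the ordinary gradient with the inverse FIM; the step size \(\eta\) below is generic notation rather than a value taken from any of the cited works:

\[
\theta_{t+1} = \theta_t - \eta \, F(\theta_t)^{-1} \nabla_\theta \mathcal{L}(\theta_t)
\]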
A new second-order algorithm based on Sc...
Neural network training algorithms have always suffered from the problem of local minima. The advent...
Current training methods for deep neural networks boil down to very high dimensional and non-convex ...
This paper proposes the Mesh Neural Network (MNN), a novel architecture which allows neurons to be c...