Second-order optimization methods can accelerate convergence by modifying the gradient through the curvature matrix. There have been many attempts to use second-order optimization methods for training deep neural networks. In this work, inspired by diagonal approximations and factored approximations such as Kronecker-factored Approximate Curvature (KFAC), we propose a new approximation to the Fisher information matrix (FIM) called Trace-restricted Kronecker-factored Approximate Curvature (TKFAC), which preserves a certain trace relationship between the exact and the approximate FIM. In TKFAC, we decompose each block of the approximate FIM as a Kronecker product of two smaller matrices scaled by a coefficient related to ...
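To make the idea concrete, below is a minimal NumPy sketch of a trace-matched Kronecker approximation of a single layer's FIM block, assuming the standard layer-wise block E[(aaᵀ)⊗(ggᵀ)] built from input activations a and back-propagated output gradients g; the function name `tkfac_block` and the exact form of the scaling coefficient are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def tkfac_block(acts, grads):
    """Trace-matched Kronecker approximation of one layer's FIM block (sketch).

    acts:  (N, d_in)  layer input activations for N samples
    grads: (N, d_out) back-propagated output gradients for N samples

    The exact block is E[(a a^T) kron (g g^T)]. We return pi * (A kron B) with
    A = E[a a^T], B = E[g g^T], and pi chosen so the traces of the exact and
    approximate blocks agree.
    """
    N = acts.shape[0]
    A = acts.T @ acts / N          # E[a a^T], shape (d_in, d_in)
    B = grads.T @ grads / N        # E[g g^T], shape (d_out, d_out)
    # Exact trace: E[||a||^2 * ||g||^2];  approximate trace: tr(A) * tr(B).
    exact_trace = np.mean(np.sum(acts**2, axis=1) * np.sum(grads**2, axis=1))
    pi = exact_trace / (np.trace(A) * np.trace(B))
    return pi * np.kron(A, B)

# Toy usage: tr(F_hat) now equals the exact block trace by construction.
rng = np.random.default_rng(0)
F_hat = tkfac_block(rng.normal(size=(256, 20)), rng.normal(size=(256, 10)))
print(F_hat.shape)  # (200, 200)
```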
The highly non-linear nature of deep neural networks causes them to be susceptible to adversarial ex...
In the context of the optimization of Deep Neural Networks, we propose to rescale the learning rate ...
For training fully-connected neural networks (FCNNs), we propose a practical approximate second-orde...
Second-order optimization methods applied to train deep neural networks use the curvature informat...
The current scalable Bayesian methods for Deep Neural Networks (DNNs) often rely on the Fisher Infor...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
Several studies have shown the ability of natural gradient descent to minimize the objective functio...
We design four novel approximations of the Fisher Information Matrix (FIM) that plays a central role...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
The stochastic gradient method is currently the prevailing technology for training neural networks. ...
Second-order optimizers are thought to hold the potential to speed up neural network training, but d...
Deep learning has recently become one of the most predominantly used techniques in the fiel...
It is well-known that second-order optimizers can accelerate the training of deep neural networks, ho...
For a long time, second-order optimization methods have been regarded as computationally inefficient...
Efficiently approximating local curvature information of the loss function is a key tool for optimiz...