It is well known that second-order optimizers can accelerate the training of deep neural networks; however, the huge computational cost of second-order optimization makes it impractical to apply in practice. To reduce this cost, many methods have been proposed to approximate the second-order matrix. Inspired by KFAC, we propose a novel Trace-based Hardware-driven layer-ORiented Natural Gradient Descent Computation method, called THOR, to make second-order optimization applicable to real application models. Specifically, we gradually increase the update interval and use the matrix trace to determine which blocks of the Fisher Information Matrix (FIM) need to be updated. Moreover, by resorting to the power of hardware, we have design...
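The trace-based selection described in this abstract can be pictured with a small sketch. The function name `select_blocks_to_update`, the relative-change threshold `tol`, and the exact comparison rule below are illustrative assumptions, not THOR's published criterion.

```python
import numpy as np

def select_blocks_to_update(fim_blocks, prev_traces, tol=0.01):
    """Illustrative sketch: refresh a layer's FIM block only when the
    relative change of its trace since the last refresh exceeds tol."""
    to_update = []
    new_traces = []
    for i, (block, prev_tr) in enumerate(zip(fim_blocks, prev_traces)):
        tr = np.trace(block)  # cheap scalar summary of the block
        if prev_tr is None or abs(tr - prev_tr) > tol * abs(prev_tr):
            to_update.append(i)      # block moved enough: recompute it
            new_traces.append(tr)
        else:
            new_traces.append(prev_tr)  # reuse the stale factor
    return to_update, new_traces
```

Under this sketch, only the blocks whose indices are returned would have their (expensive) inverses recomputed; the remaining layers keep their previously cached factors.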
We design four novel approximations of the Fisher Information Matrix (FIM) tha...
Deep neural networks currently play a prominent role in solving problems across a wide variety of di...
Second-order optimizers are thought to hold the potential to speed up neural network training, but d...
Second-order optimization methods have the ability to accelerate convergence by modifying the gradie...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
We propose a fast second-order method that can be used as a drop-in replacement for current deep lea...
Second-order optimization methods applied to train deep neural networks use the curvature informat...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
Training deep neural networks consumes increasing computational resource shares in many compute cent...
Training deep convolutional neural networks such as VGG and ResNet by gradient descent ...
Several studies have shown the ability of natural gradient descent to minimize the objective functio...
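As background for these studies, the natural gradient step preconditions the ordinary gradient with the inverse FIM; the step size \(\eta\) below is generic notation rather than a value taken from any of the cited works:

\[
\theta_{t+1} = \theta_t - \eta \, F(\theta_t)^{-1} \nabla_\theta \mathcal{L}(\theta_t)
\]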
A new second-order algorithm based on Sc...
Neural network training algorithms have always suffered from the problem of local minima. The advent...
Current training methods for deep neural networks boil down to very high dimensional and non-convex ...
This paper proposes the Mesh Neural Network (MNN), a novel architecture which allows neurons to be c...