Training deep neural networks consumes an increasing share of computational resources in many compute centers. Often, a brute-force approach is employed to obtain hyperparameter values. Our goal is (1) to enhance this by enabling second-order optimization methods with fewer hyperparameters for large-scale neural networks, and (2) to survey the performance of optimizers on specific tasks in order to suggest to users the best one for their problem. We introduce a novel second-order optimization method that requires only the effect of the Hessian on a vector and avoids the huge cost of explicitly setting up the Hessian for large-scale networks. We compare the proposed second-order method with two state-of-the-art optimizers on five representative ne...
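The key idea in the abstract above, applying the Hessian to a vector without ever materializing the Hessian, can be illustrated with a minimal sketch. The code below is not the paper's method; it uses a standard finite-difference (Pearlmutter-style) approximation of the Hessian-vector product from two gradient evaluations, checked on a toy quadratic loss whose Hessian is known exactly. The function names and the test problem are illustrative choices, not from the source.

```python
import numpy as np

def hvp(grad_fn, w, v, eps=1e-5):
    """Approximate the Hessian-vector product H(w) @ v via a central
    finite difference of the gradient, so the full (n x n) Hessian is
    never formed: only two gradient evaluations and O(n) memory."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2 * eps)

# Toy quadratic loss f(w) = 0.5 * w^T A w, whose gradient is A @ w and
# whose Hessian is exactly A (A is formed here only to verify the result).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
A = A @ A.T  # symmetric positive semi-definite
grad = lambda w: A @ w

w = rng.standard_normal(50)
v = rng.standard_normal(50)

approx = hvp(grad, w, v)  # Hessian-free: hvp only calls grad
exact = A @ v
print(np.max(np.abs(approx - exact)))
```

Such a matrix-free product is the building block that makes second-order methods (e.g., Hessian-free / Newton-CG schemes, which only need H @ v inside a conjugate-gradient inner loop) feasible at the scale of modern networks.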
Introduction Training algorithms for Multilayer Perceptrons optimize the set of W weights and biase...
We present an efficient block-diagonal approximation to the Gauss-Newton matrix for feedforward neur...
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy cons...
Training of convolutional neural networks is a high dimensional and a non-convex optimization proble...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
We propose a fast second-order method that can be used as a drop-in replacement for current deep lea...
Which numerical methods are ideal for training a neural network? In this report four different optim...
While first-order methods are popular for solving optimization problems that arise in large-scale de...
This paper presents a state-of-the-art overview on how to architect, design, and optimize Deep Neura...
Thesis (Ph.D.)--University of Washington, 2019. The advent of deep neural networks has revolutionized ...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
Neural networks, as part of deep learning, have become extremely popular due to their ability to e...
Neural networks stand out from artificial intelligence because they can complete challenging tasks, ...
The simulation of biological neural networks (BNN) is essential to neuroscience. The complexity of t...
Hessian-based analysis/computation is widely used in scientific computing. However, due to the (inco...