A new method of initializing the weights in deep neural networks is proposed. The method follows two steps. First, each layer is considered as a model of its own, and a linear regression is performed to drive the mean of the layer's output to zero and its variance, measured after the data is passed through the activation function, to one. Once each layer converges to the target mean and variance, the weights of the original model are initialized with the learned weights. Performance is evaluated on the LeNet and ResNet18 architectures on the FashionMNIST and Imagenette datasets. The activation functions used to analyze the performance are sigmoid, tanh, and ReLU. Findings show that the learned weights can perform similarly to, and in certain scenarios better than, the different types of weight in...
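The abstract frames the per-layer procedure as a regression toward a target output mean of zero and variance of one, applied layer by layer before the learned weights are copied back into the full model. As a rough illustration only, the following PyTorch sketch approximates that idea with a simple moment-matching loss minimized by gradient descent; the helper name `pretrain_layer`, the loss form, the optimizer settings, and the layer sizes are all assumptions, not the authors' exact procedure.

```python
import torch
import torch.nn as nn

def pretrain_layer(layer, activation, x, steps=200, lr=0.01):
    """Hypothetical helper: tune one layer's weights so its
    post-activation output has mean ~0 and variance ~1 on batch x."""
    opt = torch.optim.SGD(layer.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        out = activation(layer(x))
        # Moment-matching loss: drive the batch mean to 0 and variance to 1.
        loss = out.mean() ** 2 + (out.var() - 1.0) ** 2
        loss.backward()
        opt.step()
    # Pass the (detached) transformed batch on to the next layer.
    return activation(layer(x)).detach()

# Example: pre-train a small MLP layer by layer on one data batch.
torch.manual_seed(0)
x = torch.randn(256, 784)            # one batch of (flattened) inputs
layers = [nn.Linear(784, 300), nn.Linear(300, 100)]
act = torch.tanh
for layer in layers:
    x = pretrain_layer(layer, act, x)
```

After this loop, each layer in `layers` holds the learned weights, which would then replace the default initialization of the original model before normal training begins, mirroring the abstract's second step.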
Proper initialization is one of the most important prerequisites for fast convergence of feed-forwar...
The activation function deployed in a deep neural network has great influence on the performance of ...
This study highlights the subject of weight initialization in back-propagation feed-forward netw...
The importance of weight initialization when building a deep learning model is often underappreciate...
A good weight initialization is crucial to accelerate the convergence of the weights in a neural net...
Proper initialization of neural networks is critical for the successful training of their weig...
Neural networks require careful weight initialization to prevent signals from exploding or vanishing...
The learning methods for feedforward neural networks find the network’s optimal parameters through a...
The vanishing gradient problem (i.e., gradients prematurely becoming extremely small during training...
During training, one of the most important factors is weight initialization, which affects the training ...
In this paper, a novel data-driven method for weight initialization of Multilayer Perceptrons and Co...
Training a neural network (NN) depends on multiple factors, including but not limited to the initial...
The paper is devoted to the comparison of different approaches to initialization of neural network w...
Deep learning uses neural networks, which are parameterised by their weights. The neural networks ar...