Over the decades, gradient descent has been applied to develop learning algorithms for training neural networks (NNs). In this brief, a limitation of applying such an algorithm to train an NN with persistent weight noise is revealed. Let V(w) be the performance measure of an ideal NN; V(w) is used to derive the gradient descent learning (GDL) algorithm. With weight noise, the desired performance measure (denoted J(w)) is E[V(w̃)|w], where w̃ is the noisy weight vector. When GDL is applied to train an NN with weight noise, the actual learning objective is clearly not V(w) but another scalar function L(w). For decades, there has been a misconception that L(w) = J(w), and hence that the model attained by the GDL is the desired model. However, we show that it might...
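The distinction among V(w), the desired objective J(w) = E[V(w̃)|w], and the update actually computed by GDL under weight noise can be made concrete with a small numerical sketch. Everything below (the toy quadratic loss, additive Gaussian weight noise, the step size, and all function names) is an illustrative assumption, not the construction analyzed in the brief.

```python
# Minimal sketch (assumptions, not the paper's setup): a toy scalar "network" with
# loss V(w) = (w - 1)^2 and additive Gaussian weight noise w_noisy = w + n, n ~ N(0, sigma^2).
# We compare a Monte Carlo estimate of the desired objective J(w) = E[V(w_noisy) | w]
# with the update GDL actually takes when the gradient is evaluated at the noisy weights.
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.3   # weight-noise standard deviation (assumed)
eta = 0.05    # learning rate (assumed)

def V(w):       # ideal performance measure V(w)
    return (w - 1.0) ** 2

def grad_V(w):  # gradient of V
    return 2.0 * (w - 1.0)

def J(w, n_samples=10_000):
    # Monte Carlo estimate of J(w) = E[V(w + n) | w]
    noise = rng.normal(0.0, sigma, n_samples)
    return V(w + noise).mean()

w = 3.0
for step in range(200):
    w_noisy = w + rng.normal(0.0, sigma)  # persistent weight noise at this step
    w -= eta * grad_V(w_noisy)            # GDL update uses the gradient at the noisy weights

print(f"final w = {w:.3f}, V(w) = {V(w):.4f}, J(w) ~= {J(w):.4f}")
```

The sketch only makes the quantities tangible; whether the noisy-gradient iteration converges to a minimizer of J(w) is exactly the question the brief examines, and no conclusion should be drawn from this toy case.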
The largely successful method of training neural networks is to learn their weights using some varia...
Stochastic Gradient Descent (SGD) is the workhorse algorithm of deep learning technology. At each st...
We study the effect of regularization in an on-line gradient-descent learning scenario for a general...
Theoretical analysis of the error landscape of deep neural networks has garnered significant interes...
We analyze deep ReLU neural networks trained with mini-batch stochastic gradient descent and weight d...
It has been observed in numerical simulations that a weight decay can improve gener...
Supervised parameter adaptation in many artificial neural networks is largely based on an instantane...
We show analytically that training a neural network by conditioned stochastic mutation or neuroevolu...
This paper considers the Pointer Value Retrieval (PVR) benchmark introduced in [ZRKB21], where a 're...
The vanishing gradient problem (i.e., gradients prematurely becoming extremely small during trainin...
This paper presents a study of weight- and input-noise in feedforward network training algorithms. I...
The vanishing gradient problem (i.e., gradients prematurely becoming extremely small during training...
The function and performance of neural networks are largely determined by the evolution of their wei...
This paper shows that if a large neural network is used for a pattern classification problem, and th...
In most applications dealing with learning and pattern recognition, neural nets are employed as mode...