Gradient-following learning methods can encounter problems of implementation in many applications, and stochastic variants are frequently used to overcome these difficulties. We derive quantitative learning curves for three online training methods used with a linear perceptron: direct gradient descent, node perturbation, and weight perturbation. The maximum learning rate for the stochastic methods scales inversely with the first power of the dimensionality of the noise injected into the system; with sufficiently small learning rate, all three methods give identical learning curves. These results suggest guidelines for when these stochastic methods will be limited in their utility, and considerations for architectures in which they will be...
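The three update rules named in the abstract can be made concrete on a linear perceptron y = W x with squared error. The sketch below is only an illustration under assumed conventions (the teacher weights W_star, noise scale sigma, and learning rate eta are labels introduced here, not taken from the paper): direct gradient descent follows the exact error gradient, node perturbation injects noise into the outputs and correlates the resulting error change with that noise, and weight perturbation does the same with noise injected into every weight.

import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 20, 5
W_star = rng.standard_normal((n_out, n_in))   # assumed teacher weights defining the targets
eta, sigma = 0.002, 1e-3                      # assumed learning rate and perturbation size

def loss(W, x, y_target):
    return 0.5 * np.sum((y_target - W @ x) ** 2)

def direct_gradient_step(W, x, y_target):
    err = y_target - W @ x
    return W + eta * np.outer(err, x)         # exact gradient of the squared error

def node_perturbation_step(W, x, y_target):
    xi = sigma * rng.standard_normal(n_out)   # noise injected at the output nodes
    dE = (0.5 * np.sum((y_target - (W @ x + xi)) ** 2)
          - loss(W, x, y_target))             # error change caused by the output noise
    return W - (eta / sigma**2) * dE * np.outer(xi, x)

def weight_perturbation_step(W, x, y_target):
    psi = sigma * rng.standard_normal(W.shape)  # noise injected into every weight
    dE = loss(W + psi, x, y_target) - loss(W, x, y_target)
    return W - (eta / sigma**2) * dE * psi

# Train one copy of the network with each rule on the same online stream.
W_gd = np.zeros((n_out, n_in))
W_np = np.zeros((n_out, n_in))
W_wp = np.zeros((n_out, n_in))
for t in range(2000):
    x = rng.standard_normal(n_in)
    y_target = W_star @ x
    W_gd = direct_gradient_step(W_gd, x, y_target)
    W_np = node_perturbation_step(W_np, x, y_target)
    W_wp = weight_perturbation_step(W_wp, x, y_target)

print(np.linalg.norm(W_gd - W_star),
      np.linalg.norm(W_np - W_star),
      np.linalg.norm(W_wp - W_star))          # distance to the teacher for each rule

In expectation both perturbation rules follow the same gradient as the direct rule; the noise dimensionality (n_out for node perturbation, n_in * n_out for weight perturbation) only enters through the variance of the updates, which is what limits the usable learning rate.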
Introduction. The work reported here began with the desire to find a network architecture that shared...
Large-scale machine learning problems can be reduced to non-convex optimization problems if state-of...
The universal asymptotic scaling laws proposed by Amari et al. are studied in large scale simulation...
Abstract. In this paper, we study the convergence of an online gradient method for feed-forward neural...
When training a feedforward neural network by stochastic gradient descent, there is a possib...
Abstract. An online gradient method for BP neural networks is presented and discussed. The input tr...
The problem of adjusting the weights (learning) in multilayer feedforward neural networks (NN) is kn...
Many connectionist learning algorithms consist of minimizing a cost of the form C(w) = E(J(z; w)) ...
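The fragment above only states the generic form of the cost. As a purely illustrative sketch (the quadratic choice of J, the sample distribution, and the step size gamma are assumptions introduced here, not taken from the cited work), online stochastic gradient descent on C(w) = E(J(z; w)) draws one sample z per step and moves w along the negative gradient of J evaluated at that single sample.

import numpy as np

rng = np.random.default_rng(1)
dim = 10
w_true = rng.standard_normal(dim)          # assumed generator of the sample stream
w = np.zeros(dim)
gamma = 0.05                               # assumed step size

for t in range(5000):
    x = rng.standard_normal(dim)           # draw one sample z = (x, y)
    y = w_true @ x
    grad = (w @ x - y) * x                 # gradient of J(z; w) = 0.5 * (w.x - y)^2 at this sample
    w -= gamma * grad                      # online update: w <- w - gamma * grad

print(float(np.linalg.norm(w - w_true)))   # should be small after many updates

Each step uses a single-sample gradient whose expectation is the gradient of C(w), which is the sense in which the online method minimizes the expected cost.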
Understanding the implicit bias of training algorithms is of crucial importance in order to explain ...
We study on-line gradient-descent learning in multilayer networks analytically and numerically. The ...
We study stochastic algorithms in a streaming framework, trained on samples coming from a dependent ...