A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors, as well as to determine the relevance of related training algorithms based on modifications to the basic gradient-descent rule.
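To make the variational approach concrete, here is a minimal sketch in the standard statistical-mechanics setting, where the student network's order parameters (e.g., the teacher-student overlaps R and Q) evolve deterministically with the normalized number of examples α in the thermodynamic limit. The symbols x, f, λ, and the horizon α₁ below are illustrative notation introduced here, not taken from the paper:

% Sketch under assumed notation: x = order parameters, f = their equations
% of motion, \eta(\alpha) = learning rate, \alpha_1 = end of the time frame.
\begin{equation}
  \frac{d\mathbf{x}}{d\alpha} = \mathbf{f}\bigl(\mathbf{x},\eta(\alpha)\bigr),
  \qquad
  \eta^{*}(\cdot) = \operatorname*{arg\,min}_{\eta(\cdot)}\;
  \epsilon_g\bigl(\mathbf{x}(\alpha_1)\bigr).
\end{equation}
Because the initial condition is fixed, minimizing the generalization error \(\epsilon_g\) at \(\alpha_1\) is the same as maximizing its total decrease over \([0,\alpha_1]\). Adjoining the dynamics with multipliers \(\boldsymbol{\lambda}(\alpha)\),
\begin{equation}
  L = \epsilon_g\bigl(\mathbf{x}(\alpha_1)\bigr)
  + \int_{0}^{\alpha_1} \boldsymbol{\lambda}^{\top}
    \Bigl(\mathbf{f}(\mathbf{x},\eta) - \frac{d\mathbf{x}}{d\alpha}\Bigr)\,d\alpha,
\end{equation}
stationarity yields a two-point boundary-value problem,
\begin{equation}
  \frac{d\boldsymbol{\lambda}}{d\alpha}
  = -\Bigl(\frac{\partial\mathbf{f}}{\partial\mathbf{x}}\Bigr)^{\!\top}
    \boldsymbol{\lambda},
  \qquad
  \boldsymbol{\lambda}(\alpha_1) = \nabla_{\mathbf{x}}\,\epsilon_g,
  \qquad
  \boldsymbol{\lambda}^{\top}\frac{\partial\mathbf{f}}{\partial\eta} = 0,
\end{equation}
whose solution determines the learning-rate schedule \(\eta^{*}(\alpha)\) that is optimal over the whole time frame, rather than greedily at each step.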
We study the effect of regularization in an on-line gradient-descent learning scenario for a general...
An adaptive back-propagation algorithm parameterized by an inverse temperature 1/T is studied and co...
One of the fundamental limitations of artificial neural network learning by gradient descent is the ...
We present a framework for calculating globally optimal parameters, within a given time frame, for o...
We present a method for determining the globally optimal on-line learning rule for a soft committee ...
We present an analytic solution to the problem of on-line gradient-descent learning for two-layer ne...
In this paper we review recent theoretical approaches for analysing the dynamics of on-line learning...
We present a global algorithm for training multilayer neural networks in this Letter. The algorithm ...
A fast algorithm is proposed for optimal supervised learning in multiple-layer neural networks. The ...
Natural gradient descent (NGD) is an on-line algorithm for redefining the steepest descent direction...
We study on-line gradient-descent learning in multilayer networks analytically and numerically. The ...
This paper describes two algorithms based on cooperative evolution of internal hidden network repr...
We prove that two-layer (Leaky)ReLU networks with one-dimensional input and output trained using gra...