Abstract. Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. We continue the work by firstly introducing a third transformation to normalize the scale of the outputs of each hidden neuron, and secondly by analyzing the connections to second order optimization methods. We show that the transformations make a simple stochastic gradient behave closer to second-order optimization methods and thus speed up learning. This is shown both in theory and with experiments. The experiments on the third transformation show that while it further increases the speed of learning...
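As a concrete illustration of the three transformations described above, here is a minimal numpy sketch, not the authors' implementation: it assumes a tanh hidden unit and estimates, from a batch of pre-activations, a slope correction a, a bias correction b, and a scale alpha so that the transformed output has zero average slope, zero mean, and unit variance. The linear dependencies removed this way would be carried by the separate shortcut connections mentioned in the abstract; transformed_tanh is a hypothetical helper name.

```python
import numpy as np

# Hedged sketch of the transformed nonlinearity (assumed tanh unit):
#   f(x) = alpha * (tanh(x) + a*x + b)
# with a, b, alpha estimated from a batch so that each hidden neuron's
# output has zero average slope, zero mean, and unit variance.
def transformed_tanh(x):
    """x: 1-D array of pre-activations of one hidden neuron over a batch."""
    a = -np.mean(1.0 - np.tanh(x) ** 2)   # cancel the average slope
    b = -np.mean(np.tanh(x) + a * x)      # cancel the average output
    y = np.tanh(x) + a * x + b
    alpha = 1.0 / np.std(y)               # third transformation: unit scale
    return alpha * y

batch = np.random.randn(1000)             # toy pre-activations
out = transformed_tanh(batch)
print(out.mean(), out.std())              # approximately 0 and 1
```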
This report contains some remarks about the backpropagation method for neural net learning. We conce...
Rapid advances in data collection and processing capabilities have allowed for the use of increasing...
An adaptive back-propagation algorithm parameterized by an inverse temperature 1/T is studied and co...
Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron net...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
We investigate a new approach to compute the gradients of artificial neural networks (ANNs), based o...
This paper proposes an improved stochastic second order learning algorithm for supervised neural net...
In this paper we explore different strategies to guide the backpropagation algorithm used for training a...
Understanding intelligence and how it allows humans to learn, to make decisions and form memories, is...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
When a parameter space has a certain underlying structure, the ordinary gradient of a function does ...
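Since this entry states the key fact behind natural-gradient learning, a small illustrative sketch may help; it is not from the cited work. Here the metric G is approximated by a damped empirical Fisher matrix built from per-sample gradients, and natural_gradient and damping are hypothetical names:

```python
import numpy as np

# Illustrative only: the ordinary gradient g is bent into the steepest
# direction under the parameter-space metric by solving G d = g, with G
# approximated by the (damped) empirical Fisher matrix.
def natural_gradient(per_sample_grads, damping=1e-4):
    """per_sample_grads: (n_samples, n_params) array of gradients."""
    g = per_sample_grads.mean(axis=0)                  # ordinary gradient
    n, p = per_sample_grads.shape
    G = per_sample_grads.T @ per_sample_grads / n      # empirical Fisher
    G += damping * np.eye(p)                           # keep G invertible
    return np.linalg.solve(G, g)                       # G^{-1} g
```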
Supervised Learning in Multi-Layered Neural Networks (MLNs) has been recently proposed through the w...
Methods to speed up learning in back propagation and to optimize the network architecture have been ...
We present an exact analysis of learning a rule by on-line gradient descent in a two-layered ...
Many connectionist learning algorithms consist of minimizing a cost of the form C(w) = E(J(z; w)) ...
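A minimal sketch of the scheme this cost implies, assuming a per-sample loss J and its gradient grad_J are given (both hypothetical here): each step draws one sample z and descends the single-sample gradient, so that on average the update follows the gradient of C(w) = E(J(z; w)).

```python
import numpy as np

# Minimal stochastic gradient sketch for C(w) = E[J(z; w)]:
# each update uses one sample z, so E[grad_J(z, w)] = dC/dw.
def sgd(w, samples, grad_J, lr=0.01, epochs=5):
    for _ in range(epochs):
        for z in samples:
            w = w - lr * grad_J(z, w)   # w <- w - eta * dJ(z; w)/dw
    return w

# Toy usage: least squares with z = (x, y), J(z; w) = 0.5 * (w @ x - y)^2
rng = np.random.default_rng(0)
xs = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
samples = list(zip(xs, xs @ w_true))
grad_J = lambda z, w: (w @ z[0] - z[1]) * z[0]
w = sgd(np.zeros(3), samples, grad_J, lr=0.1)
print(w)  # approaches w_true
```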