This paper describes the principle of "General Cyclical Training" in machine learning, where training starts and ends with "easy training" and the "hard training" happens during the middle epochs. We propose several manifestations for training neural networks, including algorithmic examples (via hyper-parameters and loss functions), data-based examples, and model-based examples. Specifically, we introduce several novel techniques: cyclical weight decay, cyclical batch size, cyclical focal loss, cyclical softmax temperature, cyclical data augmentation, cyclical gradient clipping, and cyclical semi-supervised learning. In addition, we demonstrate that cyclical weight decay, cyclical softmax temperature, and cyclical gradient clipping (as thre...
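The abstract names the cyclical schedules but not their exact form. As a minimal sketch, assuming a triangular easy-hard-easy schedule over epochs, one scalar hyper-parameter such as weight decay could be driven as follows; the function cyclical_value, the endpoint values, and the choice of "larger weight decay = harder training" are illustrative assumptions, not the authors' implementation:

def cyclical_value(epoch, total_epochs, easy_value, hard_value):
    # Triangular easy-hard-easy schedule: equals easy_value at the first
    # and last epoch and peaks at hard_value at the midpoint of training.
    # Illustrative sketch only; the paper's exact schedules may differ.
    t = epoch / max(total_epochs - 1, 1)   # fraction of training in [0, 1]
    ramp = 1.0 - abs(2.0 * t - 1.0)        # 0 at the ends, 1 at the midpoint
    return easy_value + (hard_value - easy_value) * ramp

# Example: cyclical weight decay, treating stronger regularization as
# "harder" training during the middle epochs (an assumption here).
for epoch in (0, 25, 50, 75, 99):
    wd = cyclical_value(epoch, total_epochs=100,
                        easy_value=1e-4, hard_value=1e-2)
    print(f"epoch {epoch:3d}: weight_decay = {wd:.5f}")

Under the same assumption, the identical scalar schedule could drive a cyclical batch size, softmax temperature, or gradient-clipping threshold by substituting appropriate easy and hard endpoint values.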
Existing metrics for the learning performance of feed-forward neural networks do not provide a satis...
Attractor properties of a popular discrete-time neural network model are illustrated through numeric...
This thesis presents a new theory of generalization in neural network types of learning machines. Th...
The paper first summarizes a general approach to the training of recurrent neural networks by gradie...
Two backpropagation algorithms with momentum for feedforward neural networks with a singl...
Learning curves show how a neural network is improved as the number of training examples increases a...
The importance of the problem of designing learning machines rests on the promise of one day deliver...
There are many types of activity which are commonly known as ‘learning’. Here, we shall discuss a ma...
Constructive algorithms have proved to be powerful methods for training feedforward neural networks....
This thesis is divided into two parts: the first examines various extensions to Cascade-Correlation,...
The cross-entropy softmax loss is the primary loss function used to train deep neural networks. On t...
In this chapter, we describe the basic concepts behind the functioning of recurrent neural networks ...
This paper presents a novel approach to feeding data to a Convolutional Neural Network (CNN) wh...
Neural network modeling typically ignores the role of knowledge in learning by starting from random ...
It is often difficult to predict the optimal neural network size for a particular application. Const...