Neural network models for dynamical systems have been the subject of considerable interest lately. They are often characterized by the fact that they use a fairly large number of parameters. Here we address the question of why this can be done without the usual penalty in terms of a large variance error. We show that regularization is a key explanation, and that terminating a gradient search ("backpropagation") before the true criterion minimum is found is a way of achieving regularization. This, among other things, also explains the concept of "overtraining" in neural nets.
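To make the idea of early termination as regularization concrete, here is a minimal, hedged sketch (not the paper's experiment): gradient descent on an over-parameterized least-squares criterion, stopped when a held-out validation error stops improving. The toy data, step size, and patience value are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: early stopping of a gradient search as regularization.
# Model, data, and hyperparameters below are assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)

# Over-parameterized linear-in-the-parameters model: many basis functions,
# few informative ones, so an un-regularized fit would have large variance.
n_train, n_val, n_params = 50, 50, 200
X_train = rng.normal(size=(n_train, n_params))
X_val = rng.normal(size=(n_val, n_params))
true_theta = np.zeros(n_params)
true_theta[:5] = rng.normal(size=5)          # only a few "true" parameters
y_train = X_train @ true_theta + 0.5 * rng.normal(size=n_train)
y_val = X_val @ true_theta + 0.5 * rng.normal(size=n_val)

theta = np.zeros(n_params)
step = 1e-3
best_val, best_theta = np.inf, theta.copy()
patience, bad_epochs = 20, 0

for epoch in range(10_000):
    # Gradient step on the training criterion (sum of squared errors).
    grad = X_train.T @ (X_train @ theta - y_train)
    theta -= step * grad

    # Terminate before the training minimum once validation error stops
    # improving; the stopped iterate plays the role of a regularized estimate.
    val_err = np.mean((X_val @ theta - y_val) ** 2)
    if val_err < best_val:
        best_val, best_theta, bad_epochs = val_err, theta.copy(), 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break

print(f"stopped at epoch {epoch}, best validation MSE {best_val:.3f}")
```

Continuing training past this stopping point would drive the training error further down while the validation error rises again, which is exactly the "overtraining" behavior the abstract refers to.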
In this chapter, we describe the basic concepts behind the functioning of recurrent neural networks ...
We study the training dynamics of a shallow neural network with quadratic activation functions and q...
Regularizing the gradient norm of the output of a neural network is a powerful technique, rediscover...
In this paper we discuss the role of criterion minimization as a means for parameter estimation. Mos...
In this paper we address the important problem of optimizing regularization parameters in ...
Neural networks are more expressive when they have multiple layers. In turn, conventional training m...
Numerous approaches address over-fitting in neural networks: by imposing a penalty on the parameters...
For many reasons, neural networks have become very popular AI machine learning models. Two of the mo...
Unsupervised neural networks, such as restricted Boltzmann machines (RBMs) and deep belief networks ...
Recent theoretical works on over-parameterized neural nets have focused on two aspects: optimization...
The remarkable practical success of deep learning has revealed some major surprises from a theoretic...
In this paper we study how global optimization methods (like genetic algorithms) can be used to tr...
We present weight normalization: a reparameterization of the weight vectors in a neural net...
Injecting noise within gradient descent has several desirable features. In this paper, we explore no...