Deep Learning has recently become one of the most widely used techniques in the field of Machine Learning. Optimising these models, however, is very difficult, and in order to scale training to large datasets and model sizes, practitioners use first-order optimisation methods. One of the main challenges of using the more sophisticated second-order optimisation methods is that the curvature matrices of the loss surfaces of neural networks are usually intractable, which remains an open avenue for research. In this work, we investigate the Gauss-Newton matrix for neural networks and its application in different areas of Machine Learning. Firstly, we analyse the structure of the Hessian and Gauss-Newton matrices for Feed Forward Ne...
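As a minimal illustration of the Gauss-Newton matrix discussed above: for a model f(x; w) with loss L, the generalized Gauss-Newton matrix is G = Jᵀ H_L J, where J is the Jacobian of the model outputs with respect to the parameters and H_L is the Hessian of the loss with respect to the outputs. The sketch below (all names illustrative, not from the text) uses a linear model with squared loss, where the GGN coincides with the exact Hessian, so the construction can be checked directly:

```python
import numpy as np

# Sketch of the generalized Gauss-Newton (GGN) matrix, G = J^T H_L J,
# for a linear model f(x; w) = x @ w with loss L = 0.5 * ||f - y||^2.
# For this special case the GGN equals the exact Hessian, X^T X.

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))   # 10 samples, 3 parameters
y = rng.normal(size=10)

J = X                          # Jacobian of model outputs w.r.t. w (linear model)
H_L = np.eye(len(y))           # Hessian of 0.5*||f - y||^2 w.r.t. outputs: identity
G = J.T @ H_L @ J              # generalized Gauss-Newton matrix

# Sanity check: for a linear model the GGN is the true Hessian.
assert np.allclose(G, X.T @ X)
```

For nonlinear networks J depends on w and the true Hessian gains an extra term involving second derivatives of f; the GGN drops that term, which is what makes it positive semi-definite and tractable to approximate.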
Uncertainty estimates are crucial in many deep learning problems, e.g. for active learning or safety...
The Levenberg-Marquardt (LM) learning algorithm is a popular method for training neural networks;...
Second-order optimization methods applied to train deep neural networks use the curvature informat...
We present an efficient block-diagonal approximation to the Gauss-Newton matrix for feedforward neur...
In this dissertation, we are concerned with the advancement of optimization algorithms for training ...
We design four novel approximations of the Fisher Information Matrix (FIM) that plays a central role...
For a long time, second-order optimization methods have been regarded as computationally inefficient...
Neural networks are an important class of highly flexible and powerful models inspired by the struct...
Second-order optimization methods have the ability to accelerate convergence by modifying the gradie...
The current scalable Bayesian methods for Deep Neural Networks (DNNs) often rely on the Fisher Infor...
The stochastic gradient method is currently the prevailing technology for training neural networks. ...
While first-order methods are popular for solving optimization problems that arise in large-scale de...
We analyse the dynamics of a number of second order on-line learning algorithms training multi-layer...
We introduce the Kronecker factored online Laplace approximation for overcoming catastrophic forget...