This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models....
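A hedged sketch of this book's central claim, with notation assumed here rather than quoted from the abstract: at infinite width the distribution of network outputs $z$ over initializations is Gaussian with some kernel $K$, and at large but finite width the leading non-Gaussian correction is controlled by the depth-to-width aspect ratio $L/n$,
\[ p(z) \;\propto\; \exp\!\left[ -\tfrac{1}{2}\, z^{\top} K^{-1} z \;+\; O\!\big(L/n\big) \right], \]
so the Gaussian description is recovered as $L/n \to 0$ and degrades as the network becomes deep relative to its width.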
Deep Gaussian Process (DGP) as a model prior in Bayesian learning intuitively exploits the expressiv...
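For concreteness (a standard construction; the symbols are assumptions, not quoted from the snippet): a depth-$L$ DGP composes layers of Gaussian-process-distributed maps,
\[ f \;=\; f^{(L)} \circ \cdots \circ f^{(1)}, \qquad f^{(\ell)} \sim \mathcal{GP}\big(0,\, k^{(\ell)}\big), \]
so the prior over $f$ is hierarchical and generally non-Gaussian even though each individual layer is a GP.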
We investigate the asymptotic properties of deep Residual networks (ResNets) as the number of layers...
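A common scaling in this line of work (a sketch; the exponent convention is an assumption): the residual update is written as
\[ x_{k+1} \;=\; x_k + L^{-\beta}\, f_k(x_k), \qquad k = 0, \dots, L-1, \]
so that as the number of layers $L \to \infty$ the hidden states converge, depending on $\beta$ and the distribution of the $f_k$, to a deterministic ODE or a stochastic (diffusion) limit.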
The logit outputs of a feedforward neural network at initialization are conditionally Gaussian, give...
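A minimal sketch of the conditional structure (weights $W^{(\ell)}$, nonlinearity $\phi$, and widths $n_\ell$ are assumed notation): with i.i.d. Gaussian weights, the preactivations of layer $\ell+1$ are exactly Gaussian given the activations of layer $\ell$,
\[ z^{(\ell+1)}_{i;\alpha} = \sum_{j=1}^{n_\ell} W^{(\ell)}_{ij}\, \phi\big(z^{(\ell)}_{j;\alpha}\big), \qquad z^{(\ell+1)} \,\big|\, z^{(\ell)} \;\sim\; \mathcal{N}\big(0,\, \widehat{K}^{(\ell)}\big), \]
where $\widehat{K}^{(\ell)}_{\alpha\beta} = \frac{\sigma_w^2}{n_\ell} \sum_j \phi\big(z^{(\ell)}_{j;\alpha}\big)\, \phi\big(z^{(\ell)}_{j;\beta}\big)$ is a random covariance matrix over inputs $\alpha, \beta$.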
The basic structure and definitions of artificial neural networks are presented, as an introduction to...
Deep learning has...
A main puzzle of deep networks revolves around the absence of overfitting despite overparametrizatio...
These lectures, presented at the 2022 Les Houches Summer School on Statistical Physics and Machine L...
In this paper, we consider the generalization ability of deep wide feedforward ReLU neural networks ...
Single-index models are a class of functions given by an unknown univariate ``link'' function applie...
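For concreteness (a standard definition; the symbols are chosen here, not quoted from the snippet): a single-index model applies an unknown univariate link $g$ to a one-dimensional projection of the input,
\[ f(x) \;=\; g\big(\langle w, x \rangle\big), \qquad w \in \mathbb{R}^d,\quad g : \mathbb{R} \to \mathbb{R}, \]
so learning amounts to recovering the direction $w$ together with the link $g$.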
An underlying mechanism for successful deep learning (DL) with a limited deep architecture and datas...
Artificial Neural Networks (ANNs) are complex modelling techniques that can be used to find the rela...
In recent years, Deep Neural Networks (DNNs) have managed to succeed at tasks that previously ap...
Recently proposed deep learning systems can achieve superior performance with respect to methods bas...
This article provides a comprehensive understanding of optimization in deep learning, with a primary...
We analyze feature learning in infinite-width neural networks trained with gradient flow through a s...
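A hedged sketch of the setting (notation assumed): the parameters evolve under gradient flow on the training loss,
\[ \dot{\theta}_t \;=\; -\eta\, \nabla_{\theta}\, \mathcal{L}\big(\theta_t\big), \]
and the question is how, at infinite width, the induced kernels and features move during training rather than staying frozen at their initial values.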