The limit of infinite width allows for substantial simplifications in the analytical study of overparameterized neural networks. With a suitable random initialization, an extremely wide network is well approximated by a Gaussian process, both before and during training. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimizes the generalization bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.

Comment: 20 pages, 2 figures
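To make the last two sentences concrete, the sketch below shows one generic way to train a stochastic network by minimizing a McAllester-style PAC-Bayes bound directly. It is a minimal illustration under assumed details (factorized Gaussian weights, a fixed data-independent prior, a cross-entropy surrogate for the 0-1 loss), not the paper's exact procedure; names such as StochasticLinear and pac_bayes_bound are invented for this example.

```python
# Hypothetical sketch: directly optimizing a McAllester-style PAC-Bayes bound.
# Not the paper's method; a generic illustration of the technique it names.
import math
import torch
import torch.nn as nn

class StochasticLinear(nn.Module):
    """Linear layer with factorized Gaussian weights (reparameterization trick)."""
    def __init__(self, d_in, d_out, prior_std=0.1):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(d_out, d_in) * prior_std)
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), math.log(prior_std)))
        # PAC-Bayes requires a data-independent prior: freeze it at initialization.
        self.register_buffer("prior_mu", self.mu.detach().clone())
        self.prior_std = prior_std

    def forward(self, x):
        eps = torch.randn_like(self.mu)
        w = self.mu + eps * self.log_sigma.exp()  # sample weights per forward pass
        return x @ w.t()

    def kl_to_prior(self):
        """KL(posterior || prior) for factorized Gaussians, in nats."""
        var = (2 * self.log_sigma).exp()
        prior_var = self.prior_std ** 2
        return 0.5 * (
            (var + (self.mu - self.prior_mu) ** 2) / prior_var
            - 1.0
            + math.log(prior_var)
            - 2 * self.log_sigma
        ).sum()

def pac_bayes_bound(emp_risk, kl, n, delta=0.05):
    """McAllester bound: emp_risk + sqrt((KL + log(2*sqrt(n)/delta)) / (2n))."""
    complexity = (kl + math.log(2 * math.sqrt(n) / delta)) / (2 * n)
    return emp_risk + torch.sqrt(complexity)

# Usage: minimize the bound itself rather than the empirical risk alone.
layer = StochasticLinear(784, 10)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
x, y = torch.randn(128, 784), torch.randint(0, 10, (128,))  # stand-in batch
for _ in range(100):
    opt.zero_grad()
    emp_risk = nn.functional.cross_entropy(layer(x), y)
    loss = pac_bayes_bound(emp_risk, layer.kl_to_prior(), n=60000)
    loss.backward()
    opt.step()
```

The design choice worth noting is that the KL term penalizes the posterior's drift from its initialization, so minimizing the bound trades empirical fit against this complexity term rather than regularizing with an ad hoc weight decay.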
Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations of Gaussian processes (G...
We develop a mathematically rigorous framework for multilayer neural networks in the mean field regi...
This thesis aims to study recent theoretical work in machine learning research that seeks to better ...
Recent studies have empirically investigated different methods to train stochastic neural networks o...
The logit outputs of a feedforward neural network at initialization are conditionally Gaussian, give...
This work theoretically studies stochastic neural networks, a main type of neural network in use. We...
It took until the last decade to finally see a machine match human performance on essentially any ta...
Given any deep fully connected neural network, initialized with random Gaussian parameters, we bound...
Deep neural networks have had tremendous success in a wide range of applications where they achieve ...
We make three related contributions motivated by the challenge of training stochastic neural network...
This article studies the infinite-width limit of deep feedforward neural networks whose weights are ...
This paper presents an empirical study regarding training probabilistic neural networks using traini...
We analyze feature learning in infinite-width neural networks trained with gradient flow through a s...
We focus on a specific class of shallow neural networks with a single hidden layer, namely those wit...
We consider the problem of learning a target function corresponding to a deep, extensive-width, non-...