Comparing Bayesian neural networks (BNNs) of different widths is challenging because, as width increases, multiple model properties change simultaneously, and inference in the finite-width case is intractable. In this work, we empirically compare finite- and infinite-width BNNs and provide quantitative and qualitative explanations for their performance difference. We find that when the model is misspecified, increasing width can hurt BNN performance. In these cases, we provide evidence that finite-width BNNs generalize better, partly because of properties of their frequency spectrum that allow them to adapt under model mismatch.
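As a rough illustration of the kind of comparison described above (a minimal sketch, not the paper's experimental code), the snippet below contrasts the exact infinite-width posterior, obtained by Gaussian-process regression with the one-hidden-layer ReLU NNGP kernel, with prior function draws from a finite-width ReLU BNN whose Gaussian weight priors match the same limit. The hyperparameters (sigma_w2, sigma_b2, noise) and the toy regression task are illustrative assumptions.

```python
# Sketch: infinite-width BNN posterior (NNGP regression) vs. finite-width prior draws.
import numpy as np

sigma_w2, sigma_b2, noise = 2.0, 0.1, 0.05  # illustrative prior/noise hyperparameters

def nngp_relu_kernel(X1, X2):
    """NNGP kernel of a one-hidden-layer ReLU network with Gaussian priors."""
    d = X1.shape[1]
    K0 = sigma_b2 + sigma_w2 * X1 @ X2.T / d            # input-layer kernel
    k11 = sigma_b2 + sigma_w2 * np.sum(X1**2, 1) / d    # diagonal terms for X1
    k22 = sigma_b2 + sigma_w2 * np.sum(X2**2, 1) / d    # diagonal terms for X2
    norm = np.sqrt(np.outer(k11, k22))
    cos_t = np.clip(K0 / norm, -1.0, 1.0)
    theta = np.arccos(cos_t)
    # Arc-cosine (order-1) formula for E[relu(u) relu(v)] under the GP prior
    return sigma_b2 + sigma_w2 / (2 * np.pi) * norm * (np.sin(theta) + (np.pi - theta) * cos_t)

def nngp_posterior_mean(Xtr, ytr, Xte):
    """Exact posterior mean of the infinite-width BNN (GP regression)."""
    Ktt = nngp_relu_kernel(Xtr, Xtr) + noise * np.eye(len(Xtr))
    Kst = nngp_relu_kernel(Xte, Xtr)
    return Kst @ np.linalg.solve(Ktt, ytr)

def finite_prior_draw(X, width, rng):
    """One prior function draw from a finite-width ReLU BNN with matching priors."""
    d = X.shape[1]
    W1 = rng.normal(0, np.sqrt(sigma_w2 / d), (d, width))
    b1 = rng.normal(0, np.sqrt(sigma_b2), width)
    W2 = rng.normal(0, np.sqrt(sigma_w2 / width), (width, 1))
    b2 = rng.normal(0, np.sqrt(sigma_b2))
    return (np.maximum(X @ W1 + b1, 0.0) @ W2 + b2).ravel()

rng = np.random.default_rng(0)
Xtr = np.linspace(-2, 2, 20)[:, None]
ytr = np.sin(3 * Xtr[:, 0]) + 0.1 * rng.normal(size=20)  # toy target the prior may not match well
Xte = np.linspace(-3, 3, 200)[:, None]

mean_inf = nngp_posterior_mean(Xtr, ytr, Xte)                     # infinite-width prediction
draws_narrow = [finite_prior_draw(Xte, 8, rng) for _ in range(5)]  # narrow-width prior samples
```

Plotting mean_inf against such finite-width draws (or against finite-width posterior samples obtained with an MCMC sampler) gives a qualitative picture of how the function distribution changes with width under a mismatched target.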