We investigate deep Bayesian neural networks with Gaussian priors on the weights and a class of ReLU-like nonlinearities. Bayesian neural networks with Gaussian priors are well known to induce an L2, "weight decay", regularization. Our results indicate a more intricate regularization effect at the level of the unit activations. Our main result establishes that the induced prior distribution on the units, both before and after activation, becomes increasingly heavy-tailed with the depth of the layer. We show that first-layer units are Gaussian, second-layer units are sub-exponential, and units in deeper layers are characterized by sub-Weibull distributions. Our results provide new theoretical insight on deep Bayesian neural networks.
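To give a concrete feel for this heavier-tailed behavior, the short Python sketch below is an illustrative simulation of our own (not code from the paper): it draws i.i.d. Gaussian weights, pushes a fixed input through several ReLU layers, and tracks the excess kurtosis of a pre-activation unit at each depth. Under the stated result, the first layer should look roughly Gaussian (excess kurtosis near 0), with tails growing heavier at deeper layers; the width, depth, and sample sizes chosen here are arbitrary.

```python
# Minimal simulation sketch (an assumption, not the paper's code):
# check empirically that pre-activation units become heavier-tailed
# with depth under i.i.d. Gaussian priors on the weights and ReLU.
import numpy as np

rng = np.random.default_rng(0)

n_samples = 10_000   # independent prior draws of the weights
width = 100          # units per hidden layer
depth = 4            # number of hidden layers
x = rng.normal(size=width)  # a fixed, arbitrary input vector

def excess_kurtosis(v):
    """Fourth standardized moment minus 3 (0 for a Gaussian)."""
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2) ** 2 - 3.0

pre_acts = {ell: [] for ell in range(1, depth + 1)}
for _ in range(n_samples):
    h = x
    for ell in range(1, depth + 1):
        # Gaussian prior on the weights, scaled by 1/sqrt(fan-in)
        W = rng.normal(scale=1.0 / np.sqrt(h.shape[0]),
                       size=(width, h.shape[0]))
        g = W @ h                    # pre-activation of layer ell
        pre_acts[ell].append(g[0])   # track one unit per layer
        h = np.maximum(g, 0.0)       # ReLU

for ell in range(1, depth + 1):
    k = excess_kurtosis(np.asarray(pre_acts[ell]))
    print(f"layer {ell}: excess kurtosis ~ {k:.2f}")
# Expected trend: near 0 at layer 1 (Gaussian) and increasing with depth,
# consistent with sub-exponential and then sub-Weibull tails.
```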