The paper deals with learning probability distributions of observed data by artificial neural networks. We suggest a so-called gradient conjugate prior (GCP) update appropriate for neural networks, which is a modification of the classical Bayesian update for conjugate priors. We establish a connection between the GCP update and the maximization of the log-likelihood of the predictive distribution. Unlike Bayesian neural networks, we use deterministic network weights; instead, we assume that the ground-truth distribution is normal with unknown mean and variance, and we let the neural network learn the parameters of a prior (the normal-gamma distribution) for this unknown mean and variance. The update of the pa...
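As a minimal sketch of this setup (assuming PyTorch; the names GCPHead and predictive_nll are hypothetical, not from the paper), the stated connection can be made concrete: a network with deterministic weights outputs the normal-gamma parameters (m, kappa, alpha, beta), and gradients of the predictive log-likelihood, which for a normal-gamma prior is a Student-t with 2*alpha degrees of freedom, flow back to those outputs. The GCP update itself is described as a modification of the classical conjugate update, so this sketch illustrates only the log-likelihood side of the connection.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCPHead(nn.Module):
    """Hypothetical module: maps an input x to the four parameters
    (m, kappa, alpha, beta) of a normal-gamma prior over the unknown
    mean and precision of the normal ground-truth distribution."""

    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.out = nn.Linear(hidden, 4)

    def forward(self, x):
        h = self.out(self.body(x))
        m = h[..., 0]                        # location of the prior mean
        kappa = F.softplus(h[..., 1])        # > 0: pseudo-count for the mean
        alpha = F.softplus(h[..., 2]) + 0.5  # > 1/2, so df = 2*alpha > 1
        beta = F.softplus(h[..., 3])         # > 0: gamma rate
        return m, kappa, alpha, beta

def predictive_nll(y, m, kappa, alpha, beta):
    """Negative log-likelihood of y under the predictive distribution of a
    normal-gamma prior: a Student-t with df = 2*alpha, location m, and
    scale sqrt(beta * (kappa + 1) / (alpha * kappa))."""
    scale = torch.sqrt(beta * (kappa + 1.0) / (alpha * kappa))
    pred = torch.distributions.StudentT(df=2.0 * alpha, loc=m, scale=scale)
    return -pred.log_prob(y).mean()

# One gradient step on the predictive log-likelihood with respect to the
# deterministic network weights that produce the prior parameters:
model = GCPHead(in_dim=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 3), torch.randn(32)
loss = predictive_nll(y, *model(x))
loss.backward()
opt.step()
```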
We propose a novel approach for nonlinear regression using a two-layer neural network (NN) model str...
Gradient descent and instantaneous gradient descent learning rules are popular methods for training ...
We investigate a new approach to compute the gradients of artificial neural networks (ANNs), based o...
We study the dynamics and equilibria induced by training an artificial neural network for regression...
Deep neural networks have bested notable benchmarks across computer vision, reinforcement learning, ...
The Bayesian treatment of neural networks dictates that a prior distribution is specified over their...
Bayesian inference is known to provide a general framework for incorporating prior knowledge or spec...
Since the discovery of the back-propagation method, many modified and new algorithms have been propo...
Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. Ho...
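For reference (a minimal sketch, assuming PyTorch; the helper name is hypothetical), the isotropic Gaussian prior referred to here places an N(0, sigma^2 I) density over all network weights; its log-density is, up to an additive constant, a scaled negative squared norm, which is why MAP training under this prior coincides with L2 weight decay.

```python
import torch

def isotropic_gaussian_log_prior(params, sigma: float = 1.0):
    """Log-density, up to an additive constant, of an isotropic Gaussian
    prior N(0, sigma^2 I) over all network parameters."""
    squared_norm = sum((p ** 2).sum() for p in params)
    return -squared_norm / (2.0 * sigma ** 2)

# MAP objective: data NLL minus the log-prior (i.e. plus L2 weight decay).
# loss = nll - isotropic_gaussian_log_prior(model.parameters(), sigma=1.0)
```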
While many implementations of Bayesian neural networks use large, complex hierarchical priors, in mu...
In this study, we focus on feed-forward neural networks with a single hidden layer. The research tou...
In recent years, Neural Networks (NN) have become a popular data-analytic tool in Statistics, Compu...
Bayesian treatments of learning in neural networks are typically based either on a local Gaussian ap...
We investigate deep Bayesian neural networks with Gaussian priors on the weigh...
In stochastic gradient descent (SGD) and its variants, the optimized gradient estimators may be as e...