Training neural networks with discrete stochastic variables presents a unique challenge. Backpropagation is not directly applicable, nor are the reparameterization tricks used in networks with continuous stochastic variables. To address this challenge, we present Hindsight Network Credit Assignment (HNCA), a novel gradient estimation algorithm for networks of discrete stochastic units. HNCA works by assigning credit to each unit based on the degree to which its output influences its immediate children in the network. We prove that HNCA produces unbiased gradient estimates with reduced variance compared to the REINFORCE estimator, while the computational cost is similar to that of backpropagation. We first apply HNCA in a contextual bandit setting.
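For reference, the following is a minimal sketch (not taken from the paper) of the REINFORCE, or score-function, estimator for a single Bernoulli stochastic unit, the baseline estimator that HNCA is compared against. The toy reward, variable names, and learning-rate choice are illustrative assumptions, not details of the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

def reinforce_grad(theta, x, reward_fn):
    """One-sample REINFORCE estimate of d E[reward] / d theta.

    The unit fires b ~ Bernoulli(p) with p = sigmoid(theta . x); the
    estimator is reward * d log Pr(b | x; theta) / d theta.
    This is the high-variance baseline that HNCA aims to improve on.
    """
    p = 1.0 / (1.0 + np.exp(-x @ theta))   # firing probability of the unit
    b = float(rng.random() < p)            # sample the discrete output
    r = reward_fn(b)                       # reward depends only on the sample
    grad_logp = (b - p) * x                # gradient of the Bernoulli log-likelihood
    return r * grad_logp

# Toy usage: reward 1 when the unit fires, so the estimate should push p toward 1.
x = np.array([1.0, 0.5])
theta = np.zeros(2)
for _ in range(200):
    theta += 0.1 * reinforce_grad(theta, x, reward_fn=lambda b: b)
print("learned firing probability:", 1.0 / (1.0 + np.exp(-x @ theta)))
```

Because the reward multiplies the entire score function, single-sample estimates of this form can have high variance; HNCA's credit assignment through a unit's immediate children is presented as a way to reduce that variance without introducing bias.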