The reparameterization trick enables optimizing large scale stochastic computation graphs via gradient descent. The essence of the trick is to refactor each stochastic node into a differentiable function of its parameters and a random variable with fixed distribution. After refactoring, the gradients of the loss propagated by the chain rule through the graph are low variance unbiased estimators of the gradients of the expected loss. While many continuous random variables have such reparameterizations, discrete random variables lack useful reparameterizations due to the discontinuous nature of discrete states. In this work we introduce CONCRETE random variables—CONtinuous relaxations of disCRETE random variables. The Concrete distribution is...
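To illustrate the refactoring described above, here is a minimal sketch, in PyTorch, of a reparameterized sample from a Concrete (Gumbel-Softmax style) relaxation of a categorical variable. The function name, the temperature value, and the toy loss are illustrative assumptions, not part of the paper; the point is only that the sample is a differentiable function of the parameters and of noise with a fixed distribution, so gradients of the loss reach the parameters through the chain rule.

    import torch

    def sample_concrete(logits, temperature=0.5):
        """Reparameterized sample from a Concrete (Gumbel-Softmax) relaxation.

        Illustrative sketch (names and temperature are assumptions): the sample
        is a deterministic, differentiable function of `logits` and of Gumbel
        noise with a fixed distribution, so gradients flow back to `logits`.
        """
        # Gumbel(0, 1) noise via the inverse-CDF transform of Uniform(0, 1).
        u = torch.rand_like(logits)
        gumbel = -torch.log(-torch.log(u + 1e-20) + 1e-20)
        # Relaxed one-hot sample on the simplex; as temperature -> 0 it
        # approaches the discrete Gumbel-Max (argmax) sample.
        return torch.softmax((logits + gumbel) / temperature, dim=-1)

    # Example: gradients of a loss on the relaxed sample reach the logits.
    logits = torch.zeros(3, requires_grad=True)
    x = sample_concrete(logits, temperature=0.5)
    loss = (x * torch.tensor([1.0, 2.0, 3.0])).sum()
    loss.backward()
    print(x, logits.grad)

The discrete argmax of (logits + gumbel) would not be differentiable in the logits; replacing it with a tempered softmax is exactly the continuous relaxation the abstract refers to.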