Discrete expectations arise in various machine learning tasks, and we often need to backpropagate the gradient through them. One domain is variational inference, where training discrete latent variable models requires gradient estimates of a high-dimensional discrete distribution, because we are backpropagating through a discrete stochastic layer in a deep neural network. Another important area of research is permutation- or ranking-based objectives, where the objective itself is discrete and non-differentiable. To tackle these problems, we propose ARMS, an antithetic REINFORCE-based Monte Carlo gradient estimator for three different discrete distributions: binary, categorical, and Plackett-Luce, where the last two are generalizations of the pr...
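The REINFORCE (score-function) estimator with antithetic sampling that this abstract builds on can be sketched for a single Bernoulli variable; the coupling via the pair (u, 1-u) and all names below are illustrative assumptions, not the ARMS construction itself.

```python
import numpy as np

def reinforce_antithetic(f, logit, n_pairs=1000, rng=None):
    """Score-function (REINFORCE) estimate of d/d(logit) E_{b ~ Bernoulli(p)}[f(b)],
    p = sigmoid(logit), using antithetic uniform pairs (u, 1 - u) so that each
    pair of samples is negatively correlated, which reduces variance.
    Illustrative sketch only, not the ARMS estimator."""
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 / (1.0 + np.exp(-logit))
    u = rng.uniform(size=n_pairs)
    grads = []
    for ui in np.concatenate([u, 1.0 - u]):  # each u paired with its antithesis
        b = float(ui < p)                    # Bernoulli sample via inverse CDF
        # score: d/d(logit) log Bernoulli(b; sigmoid(logit)) = b - p
        grads.append(f(b) * (b - p))
    return np.mean(grads)
```

For f(b) = b the true gradient is p(1 - p); at logit = 0 the antithetic pairing makes every pair of samples contribute exactly the true value 0.25, illustrating the variance reduction.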
This thesis presents five contributions to machine learning, with themes of differentiability and Ba...
Modern machine learning models are complex, hierarchical, and large-scale and are trained using non-...
Uncertainty estimation in large deep-learning models is a computationally challenging task, where it...
Gradient estimation is often necessary for fitting generative models with discrete latent variables,...
By enabling correct differentiation in Stochastic Computation Graphs (SCGs), the infinitely differen...
Gradient estimation -- approximating the gradient of an expectation with respect to the parameters o...
The integration of discrete algorithmic components in deep learning architectures has numerous appli...
Optimization with noisy gradients has become ubiquitous in statistics and machine learning. Reparame...
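The reparameterization (pathwise) idea that the abstract above refers to can be sketched for a Gaussian latent variable; the Gaussian example and all names are illustrative assumptions.

```python
import numpy as np

def reparam_grad_mu(grad_f, mu, sigma, n=10_000, rng=None):
    """Pathwise (reparameterization) estimate of d/d(mu) E_{z ~ N(mu, sigma^2)}[f(z)].
    Writing z = mu + sigma * eps with eps ~ N(0, 1) moves the parameter out of the
    sampling distribution, so the gradient becomes E[f'(mu + sigma * eps)].
    Illustrative sketch only."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(n)
    return np.mean(grad_f(mu + sigma * eps))
```

For f(z) = z^2 we have E[f(z)] = mu^2 + sigma^2, so the true gradient is 2 * mu, which the estimate approaches as n grows.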
Learning models with discrete latent variables using stochastic gradient descent remains a challenge...
Stochastic gradient-based optimisation for discrete latent variable models is challenging due to the...
Policy gradient methods are reinforcement learning algorithms that adapt a parameterized policy by f...
How can we perform efficient inference and learning in directed probabilistic models, in the presenc...