Numerous models for supervised and reinforcement learning benefit from combinations of discrete and continuous model components. End-to-end learnable discrete-continuous models are compositional, tend to generalize better, and are more interpretable. A popular approach to building discrete-continuous computation graphs is to integrate discrete probability distributions into neural networks using stochastic softmax tricks. Prior work has mainly focused on computation graphs with a single discrete component on each of the graph's execution paths. We analyze the behavior of more complex stochastic computation graphs with multiple sequential discrete components. We show that it is challenging to optimize the parameters of these models, ...
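To make the stochastic softmax idea concrete, the sketch below shows a Gumbel-Softmax relaxation of a categorical variable and a toy computation graph with two sequential discrete components. This is a minimal illustration, not the paper's architecture: the temperature `tau`, the layer sizes, and the two-layer gating structure are illustrative assumptions.

```python
# A minimal sketch of the Gumbel-Softmax (stochastic softmax) trick in PyTorch.
# The temperature `tau` and the toy two-layer graph are illustrative choices.
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Draw a differentiable, approximately one-hot sample from Categorical(logits)."""
    gumbels = -torch.log(-torch.log(torch.rand_like(logits)))  # Gumbel(0, 1) noise
    return F.softmax((logits + gumbels) / tau, dim=-1)

# Two sequential discrete components: each relaxed sample feeds the next layer,
# which is the setting where small gradients and local minima become an issue.
x = torch.randn(8, 16)
enc1, enc2 = torch.nn.Linear(16, 4), torch.nn.Linear(4, 4)
z1 = gumbel_softmax_sample(enc1(x))    # first discrete (relaxed) choice
z2 = gumbel_softmax_sample(enc2(z1))   # second choice, conditioned on the first
loss = z2.sum()
loss.backward()                        # gradients flow through both samples
```

PyTorch ships an equivalent built-in, `F.gumbel_softmax(logits, tau, hard=...)`; the manual version above only makes the noise perturbation and temperature explicit.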