This thesis considers three complications that arise when applying reinforcement learning to a real-world application. In using reinforcement learning to build an adaptive electronic market-maker, we find that the sparsity of data, the partial observability of the domain, and the multiple objectives of the agent cause serious problems for existing reinforcement learning algorithms. We employ importance sampling (likelihood ratios) to achieve good performance in partially observable Markov decision processes with little data. Our importance sampling estimator requires no knowledge of the environment and places few restrictions on how the data are collected. It can be used efficiently with reactive controllers, finite-state c...
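As a rough illustration of the likelihood-ratio idea mentioned above, the following minimal sketch estimates the return of a target policy from trajectories gathered under a different behavior policy. It is not the thesis's estimator: the names `likelihood_ratio` and `is_estimates`, the `(observation, action, reward)` trajectory layout, and the restriction to reactive (memoryless) policies are all assumptions made for this example. The point it shows is that the ratio depends only on the two policies' action probabilities, so no model of the environment is needed.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical data layout: a trajectory is a list of
# (observation, action, reward) triples.
Step = Tuple[int, int, float]
# Hypothetical reactive policy: policy(observation, action) -> probability.
Policy = Callable[[int, int], float]


def likelihood_ratio(traj: Sequence[Step], target: Policy, behavior: Policy) -> float:
    """Product over steps of target/behavior action probabilities.

    Environment dynamics appear in both trajectory likelihoods and cancel,
    so only the policies' probabilities are needed. Assumes the behavior
    policy gave nonzero probability to every action it took.
    """
    ratio = 1.0
    for obs, action, _ in traj:
        ratio *= target(obs, action) / behavior(obs, action)
    return ratio


def is_estimates(
    trajs: List[Sequence[Step]], target: Policy, behavior: Policy
) -> Tuple[float, float]:
    """Return (ordinary, normalized) importance sampling estimates
    of the target policy's expected undiscounted return."""
    weights = [likelihood_ratio(t, target, behavior) for t in trajs]
    returns = [sum(r for _, _, r in t) for t in trajs]
    weighted_sum = sum(w * g for w, g in zip(weights, returns))
    ordinary = weighted_sum / len(trajs)          # unbiased, higher variance
    total = sum(weights)
    normalized = weighted_sum / total if total > 0 else 0.0  # biased, lower variance
    return ordinary, normalized


# Usage: data collected by a uniform policy over two actions,
# evaluated for a target policy that prefers action 1.
behavior = lambda obs, a: 0.5
target = lambda obs, a: 0.75 if a == 1 else 0.25
data = [[(0, 1, 1.0)], [(0, 1, 3.0)], [(0, 0, 5.0)]]
ordinary, normalized = is_estimates(data, target, behavior)
```

The ordinary estimate divides by the number of trajectories, while the normalized variant divides by the sum of the weights; the latter trades a small bias for lower variance, which tends to matter when data are scarce.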
Marginalized importance sampling (MIS), which measures the density ratio between the state-action oc...
In this master's thesis, we have tried to solve two of the most prominent Reinforcement Learning problems:...
The quintessential model-based reinforcement-learning agent iteratively refines its estimates or pri...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
In reinforcement learning, an agent collects information interacting with an e...
How can we effectively exploit the collected samples when solving a continuous control task with Rei...
Learning from interaction with the environment -- trying untested actions, observing successes and f...
Applying the reinforcement learning methodology to domains that involve risky decisions like medicin...
In this article we study the connection of stochastic optimal control and reinforcement learning. Ou...
Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a policy ...
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past,...
Stochastic processes are an important theoretical tool to model sequential phenomena in the natural...
With the advent of Big Data and Machine Learning, there is a demand for improved decision making in un...
An important step in the design of autonomous systems is to evaluate the probability that a failure ...
Reinforcement Learning (RL) is the field of research focused on solving sequential decision-making t...