Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.Includes bibliographical references (p. 115-118).This thesis considers three complications that arise from applying reinforcement learning to a real-world application. In the process of using reinforcement learning to build an adaptive electronic market-maker, we find the sparsity of data, the partial observability of the domain, and the multiple objectives of the agent to cause serious problems for existing reinforcement learning algorithms. We employ importance sampling (likelihood ratios) to achieve good performance in partially observable Nlarkov decision processes with few data. Our importance sampling estimator requires n...
Sequential decision making from experience, or reinforcement learning (RL), is a paradigm that is we...
Likelihood ratio policy gradient methods have been some of the most successful reinforcement learnin...
In this article we study the connection of stochastic optimal control and reinforcement learning. Ou...
This thesis considers three complications that arise from applying reinforcement learning to a real-...
How can we effectively exploit the collected samples when solving a continuous control task with Rei...
Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a policy ...
Abstract In this paper we analyze a particular issue of estimation, namely the estimation of the exp...
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past,...
Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a policy ...
Stochastic processes are an important theoretical tool to model sequential phenomenon in the natural...
Abstract. Reinforcement learning means finding the optimal course of action in Markovian environment...
Reinforcement learning is a general computational framework for learning sequential decision strate...
We consider the transfer of experience samples in reinforcement learning. Most of the previous works...
The central question addressed in this research is ”can we define evaluation methodologies that enco...
A central challenge to applying many off-policy reinforcement learning algorithms to real world prob...
Sequential decision making from experience, or reinforcement learning (RL), is a paradigm that is we...
Likelihood ratio policy gradient methods have been some of the most successful reinforcement learnin...
In this article we study the connection of stochastic optimal control and reinforcement learning. Ou...
This thesis considers three complications that arise from applying reinforcement learning to a real-...
How can we effectively exploit the collected samples when solving a continuous control task with Rei...
Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a policy ...
Abstract In this paper we analyze a particular issue of estimation, namely the estimation of the exp...
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past,...
Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a policy ...
Stochastic processes are an important theoretical tool to model sequential phenomenon in the natural...
Abstract. Reinforcement learning means finding the optimal course of action in Markovian environment...
Reinforcement learning is a general computational framework for learning sequential decision strate...
We consider the transfer of experience samples in reinforcement learning. Most of the previous works...
The central question addressed in this research is ”can we define evaluation methodologies that enco...
A central challenge to applying many off-policy reinforcement learning algorithms to real world prob...
Sequential decision making from experience, or reinforcement learning (RL), is a paradigm that is we...
Likelihood ratio policy gradient methods have been some of the most successful reinforcement learnin...
In this article we study the connection of stochastic optimal control and reinforcement learning. Ou...