We address the problem of computing the optimal Q-function in Markov decision problems with infinite state-space. We analyze the convergence properties of several variations of Q-learning when combined with function approximation, extending the analysis of TD-learning in (Tsitsiklis & Van Roy, 1996a) to stochastic control settings. We identify conditions under which such approximate methods converge with probability 1. We conclude with a brief discussion on the general applicability of our results and compare them with several related works.
Reinforcement learning (RL) is a computational framework for learning sequential decision strategies...
We propose for risk-sensitive control of finite Markov chains a counterpart of the popular Q-learnin...
The paper investigates the possibility of applying value function based reinforcement learning (RL)...
Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Ma...
This paper presents a reinforcement learning algorithm for solving infinite horizon Markov Decision ...
In this paper, we present a brief survey of reinforcement learning, with particular emphasis on stoc...
Includes bibliographical references (p. 18-20). Supported by the National Science Foundation. ECS-921...
Recent developments in the area of reinforcement learning have yielded a number of new algorithms fo...
Q-learning is a technique used to compute an optimal policy for a controlled Markov chai...
A linear function approximation-based reinforcement learning algorithm is proposed for Markov decisi...
It is shown here that stability of the stochastic approximation algorithm is implied by the asymptot...
In this work we propose an approach for generalization in continuous domain Reinforcement Learning t...
Reinforcement learning is a general computational framework for learning sequential decision strate...
Semi-Markov Decision Problems are continuous time generalizations of discrete time Markov Decision P...