Q-learning is a popular reinforcement learning algorithm. However, it has been studied and analysed mainly in the infinite horizon setting, even though several important applications can be modeled as finite horizon Markov decision processes (MDPs). We develop a version of the Q-learning algorithm for finite horizon MDPs and provide a full proof of its stability and convergence. Our analysis of the stability and convergence of finite horizon Q-learning is based entirely on the ordinary differential equation (ODE) method. We also demonstrate the performance of our algorithm on a setting of random MDPs as well as on a smart grid application.
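To make the finite horizon setting concrete, the following is a minimal tabular sketch, not the algorithm analysed in the paper. It assumes one Q-table per stage h = 0, ..., H-1, a zero terminal value, undiscounted rewards, uniform random exploration, and step sizes 1/n per (stage, state, action) visit; the function names `finite_horizon_q_learning` and `backward_induction` and all parameters are hypothetical choices made for illustration.

```python
import numpy as np


def finite_horizon_q_learning(P, R, H, episodes=20000, seed=0):
    """Tabular Q-learning with a separate Q-table for each stage h.

    P[s, a] is a probability vector over next states and R[s, a] is the
    one-step reward. The terminal value is assumed to be zero.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros((H + 1, n_states, n_actions))  # Q[H] stays zero (terminal stage)
    visits = np.zeros((H, n_states, n_actions))
    for _ in range(episodes):
        s = rng.integers(n_states)  # assumed: uniform initial state
        for h in range(H):
            a = rng.integers(n_actions)  # assumed: uniform exploration
            s_next = rng.choice(n_states, p=P[s, a])
            visits[h, s, a] += 1
            alpha = 1.0 / visits[h, s, a]  # diminishing step size
            # The target bootstraps from the NEXT stage's table, not the same one.
            target = R[s, a] + Q[h + 1, s_next].max()
            Q[h, s, a] += alpha * (target - Q[h, s, a])
            s = s_next
    return Q[:H]


def backward_induction(P, R, H):
    """Exact stage-wise Q-values via dynamic programming, for comparison."""
    n_states, n_actions = R.shape
    Q = np.zeros((H + 1, n_states, n_actions))
    for h in range(H - 1, -1, -1):
        Q[h] = R + P @ Q[h + 1].max(axis=1)
    return Q[:H]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    P = rng.dirichlet(np.ones(3), size=(3, 2))  # random MDP: 3 states, 2 actions
    R = rng.random((3, 2))
    gap = np.abs(finite_horizon_q_learning(P, R, H=3) - backward_induction(P, R, H=3))
    print("max abs gap to dynamic programming:", gap.max())
```

The main difference from the infinite horizon update is that the Q-values are indexed by stage, so the greedy action for the same state may change as the horizon approaches. Comparing against backward induction is only possible here because the sketch has access to the model `P`; the learning routine itself uses `P` solely to simulate transitions.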
We consider the problem of model-free reinforcement learning in the Markovian decision processes (MD...
In the field of Reinforcement Learning, Markov Decision Processes with a finite number of states and...
Recent developments in the area of reinforcement learning have yielded a number of new algorithms ...
This paper presents a reinforcement learning algorithm for solving infinite horizon Markov Decision ...
The infinite horizon setting is widely adopted for problems of reinforcement learning (RL). These in...
We develop a simulation based algorithm for finite horizon Markov decision processes with finite sta...
Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Ma...
© 2018 Curran Associates Inc. All rights reserved. We consider model-free reinforcement learning for...
Includes bibliographical references (p. 18-20). Supported by the National Science Foundation. ECS-921...