© 2016 The Authors and IOS Press. Q-learning associates states and actions of a Markov Decision Process with expected future reward through online learning. In practice, however, when the state space is large and experience is still limited, the algorithm will not find a match between the current state and past experience unless some details describing states are ignored. On the other hand, reducing state information affects long-term performance because decisions will need to be made on less informative inputs. We propose a variation of Q-learning that gradually enriches state descriptions after enough experience is accumulated. This is coupled with an ad-hoc exploration strategy that aims at collecting key information that allows the algorithm to e...
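For orientation, the baseline that the abstract above builds on can be sketched as standard tabular Q-learning. This is a minimal sketch of the textbook update only, not the proposed variation with enriched state descriptions; the chain environment, the function name `q_learning`, and all parameter values are hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain MDP (an assumption,
    not an environment from the paper): states 0..n_states-1, action 0
    moves left, action 1 moves right, and entering the last state
    yields reward 1 and ends the episode."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> estimated return
    actions = (0, 1)
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions,
                             key=lambda a: (q[(state, a)], rng.random()))
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The terminal state has value 0; otherwise bootstrap from
            # the best action value in the next state.
            best_next = 0.0 if next_state == goal else max(
                q[(next_state, a)] for a in actions)
            td_error = reward + gamma * best_next - q[(state, action)]
            q[(state, action)] += alpha * td_error
            state = next_state
    return q


if __name__ == "__main__":
    table = q_learning()
    for s in range(4):
        print(s, round(table[(s, 0)], 3), round(table[(s, 1)], 3))
```

The table is keyed by the full state, so an unseen state starts with no information at all; this is the matching problem that the abstract describes for large state spaces.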
In this study, we investigate the effects of conditioning Independent Q-Learners (IQL) not solely on...
This thesis presents a modified Q-learning algorithm and provides conditions for convergence to a pu...
A very general framework for modeling uncertainty in learning environments is given by Partially Obs...
Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Ma...
Q-learning can be used to find an optimal action-selection policy for any given finite Markov Decisi...
This thesis involves the use of a reinforcement learning algorithm (RL) called Q-learning to train a...
Abstract. Q-learning can be used to learn a control policy that maximises a scalar reward through i...
We extend the Q-learning algorithm from the Markov Decision Process setting to problems where observ...
The paper analyzes one of the main reinforcement learning methods, Q-learning, which is actively us...
Value-based approaches to reinforcement learning (RL) maintain a value function that measures the lo...
Applying Q-Learning to multidimensional, real-valued state spaces is time-consuming in most cases. I...
Reinforcement learning has successfully been used in many applications and achieved prodigious perfo...
This paper presents a novel incremental algorithm that combines Q-learning, a well-known dynamic-pro...
Q-learning is a very popular reinforcement learning algorithm that has been proven to converge to optimal po...
Q-learning (QL) is a popular reinforcement learning algorithm that is guaranteed to converge to opti...