In this paper we derive convergence rates for Q-learning. We show an interesting relationship between the convergence rate and the learning rate used in Q-learning. For a polynomial learning rate, one which is 1/t^ω at time t where ω ∈ (1/2, 1), we show that the convergence rate is polynomial in 1/(1 − γ), where γ is the discount factor. In contrast, we show that for a linear learning rate, one which is 1/t at time t, the convergence rate has an exponential dependence on 1/(1 − γ). In addition, we show a simple example that proves this exponential behavior is inherent for linear learning rates.
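Since this abstract contrasts the two step-size schedules, a minimal sketch may clarify the comparison. The toy two-state MDP, the function names, and ω = 0.6 below are illustrative assumptions, not the paper's construction; both runs use the standard tabular Q-learning update and differ only in the schedule α_t.

```python
import numpy as np

def q_learning(alpha_schedule, gamma=0.9, steps=50_000, seed=0):
    """Tabular Q-learning on a hypothetical two-state toy MDP."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = 2, 2
    step = {0: lambda s: s, 1: lambda s: 1 - s}   # action 0 stays, action 1 flips the state
    reward = lambda s, a: 1.0 if (s == 1 and a == 0) else 0.0
    Q = np.zeros((n_states, n_actions))
    visits = np.zeros((n_states, n_actions))      # per-pair update count t
    s = 0
    for _ in range(steps):
        a = int(rng.integers(n_actions))          # uniform exploration
        s_next, r = step[a](s), reward(s, a)
        visits[s, a] += 1
        alpha = alpha_schedule(visits[s, a])      # learning rate at local time t
        # Standard Q-learning update (Watkins, 1989):
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
    return Q

Q_poly = q_learning(lambda t: 1.0 / t ** 0.6)    # polynomial rate, ω = 0.6
Q_lin  = q_learning(lambda t: 1.0 / t)           # linear rate, slow when γ is near 1
```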
We consider the solution of discounted optimal stopping problems using linear function approximation...
We present in this article a two-timescale variant of Q-learning with linear function approximation....
There are two well known Stochastic Approximation techniques that are known to...
Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains...
We introduce a new convergent variant of Q-learning, called speedy Q-learning, to address the proble...
In reinforcement learning, Q-learning is the best-known algorithm, but it suffers from overestimation...
We consider the problem of model-free reinforcement learning in the Markovian decision processes (MDPs)...
Q(λ)-learning uses TD(λ) methods to accelerate Q-learning. The worst-case complexity for a single update...
This paper discusses agents' learning on a market. The price level evolves through a multivariable a...
The method of temporal differences (TD) is one way of making consistent predictions about the future...
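As this entry concerns TD prediction, a minimal TD(0) sketch may be useful; the random-walk chain and every name below are illustrative assumptions, not the cited paper's setup.

```python
import numpy as np

def td0(n_states=5, alpha=0.1, gamma=1.0, episodes=1000, seed=0):
    """TD(0) value prediction on a hypothetical random-walk chain."""
    rng = np.random.default_rng(seed)
    V = np.zeros(n_states + 2)                    # states 0 and n_states+1 are terminal
    for _ in range(episodes):
        s = (n_states + 1) // 2                   # start in the middle of the chain
        while 0 < s < n_states + 1:
            s_next = s + int(rng.choice([-1, 1]))   # unbiased random walk
            r = 1.0 if s_next == n_states + 1 else 0.0  # reward only at the right end
            # TD(0) update: move V(s) toward the one-step bootstrapped target.
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V[1:-1]
```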
This paper observes that the perceptron algorithm converges under exponentially increasing learning ...
A recent line of works, initiated by Russo and Xu, has shown that the generalization error of a learning algorithm...
We provide a bound on the first moment of the error in the Q-function estimate resulting from fixed ...
Q-learning (QL) is a popular reinforcement learning algorithm that is guaranteed to converge to optimal...