Using a martingale concentration inequality, concentration bounds "from time $n_0$ on" are derived for stochastic approximation algorithms with contractive maps, under both martingale-difference and Markov noise. These are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0). Comment: 20 pages; accepted for publication in Stochastic Systems.
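The abstract above concerns stochastic approximation with a contractive map. As a minimal illustrative sketch (not the paper's algorithm), tabular asynchronous Q-learning on a toy MDP can be written as exactly such an iteration: the Bellman operator F is a gamma-contraction, and the Q-learning update is a noisy SA step toward its fixed point. The MDP, step-size schedule, and all names below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.5                                         # contraction factor of F

P = np.array([[[0.8, 0.2], [0.3, 0.7]],             # P[s, a, s'] (toy kernel)
              [[0.5, 0.5], [0.9, 0.1]]])
r = np.array([[1.0, 0.0],                           # r[s, a] (toy rewards)
              [0.0, 2.0]])

def bellman(Q):
    """F(Q)(s,a) = r(s,a) + gamma * E[max_a' Q(s',a')]: a gamma-contraction
    in the sup norm, so it has a unique fixed point Q*."""
    return r + gamma * P @ Q.max(axis=1)

# Exact fixed point Q* by iterating the contraction (Banach fixed point).
Qstar = np.zeros((2, 2))
for _ in range(200):
    Qstar = bellman(Qstar)

# Asynchronous Q-learning: only the visited (s, a) entry is updated, using a
# single-transition unbiased sample of F(Q)(s, a) as the target.
Q, s = np.zeros((2, 2)), 0
for n in range(1, 50001):
    a = rng.integers(2)                             # exploratory behaviour policy
    s2 = rng.choice(2, p=P[s, a])
    target = r[s, a] + gamma * Q[s2].max()          # noisy sample of F(Q)(s,a)
    Q[s, a] += (target - Q[s, a]) / (1 + n // 100)  # tapering SA step sizes
    s = s2

err = np.abs(Q - Qstar).max()                       # iterates concentrate near Q*
```

The martingale-difference noise here is the gap between the sampled target and its conditional expectation F(Q)(s, a); concentration results of the kind in the abstract bound how far the iterates stray from Q* after a given time.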
We present new concentration of measure inequalities for Markov chains, generalising results for cha...
In this dissertation we study concentration properties of Markov chains, and sequential decision maki...
We study the problem of estimating the fixed point of a contractive operator defined on a separable ...
The stochastic approximation (SA) algorithm is a widely used probabilistic method for finding a zero...
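The entry above describes SA as a method for finding a zero of a function from noisy evaluations. A minimal Robbins–Monro sketch (illustrative only; the target function, noise level, and step sizes are assumptions, not taken from any cited work):

```python
import numpy as np

# Find the root of f(x) = 2 - x given only noisy evaluations f(x) + w,
# using the Robbins-Monro iteration x_{n+1} = x_n + a_n * (f(x_n) + w_n).
rng = np.random.default_rng(1)
x = 0.0
for n in range(1, 20001):
    noisy_f = (2.0 - x) + rng.normal(scale=0.5)  # unbiased noisy oracle
    x += noisy_f / n                             # step sizes a_n = 1/n
# x converges almost surely to the root x* = 2.
```

With a_n = 1/n the iterate here reduces to a running average of the noisy observations of the root, which is why the classical step-size conditions (sum a_n = inf, sum a_n^2 < inf) deliver almost-sure convergence.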
Following the novel paradigm developed by Van Roy and coauthors for reinforcement learning in arbitr...
Reinforcement learning is a framework for solving sequential decision-making problems without requir...
In this handout we analyse reinforcement learning algorithms for Markov decision processes. The read...
This paper develops a unified framework to study finite-sample convergence guarantees of a large cl...
It is shown here that stability of the stochastic approximation algorithm is implied by the asymptot...
Includes bibliographical references (p. 18-20). Supported by the National Science Foundation. ECS-921...
Recent developments in the area of reinforcement learning have yielded a number of new algorithms ...
We address the problem of computing the optimal Q-function in Markov decision problems with infinit...
We study the mixing properties of an important optimization algorithm of machine learning: the stoch...
We are interested in understanding stability (almost sure boundedness) of stochastic approximation a...
We revisit the classical model of Tsitsiklis, Bertsekas and Athans for distributed stochastic approx...