Reinforcement learning is a framework for solving sequential decision-making problems without requiring a model of the environment, and is viewed as a promising approach to achieving artificial intelligence. However, there is a huge gap between the empirical successes and the theoretical understanding of reinforcement learning. In this thesis, we make an effort to bridge this gap. More formally, this thesis focuses on designing data-efficient reinforcement learning algorithms and establishing their finite-sample guarantees. Specifically, we aim to answer the following question: suppose we carry out some reinforcement learning algorithm with a finite number of samples (or a finite number of iterations), then what can we say about the perfo...
Following the novel paradigm developed by Van Roy and coauthors for reinforcement learning in arbitr...
We introduce the first temporal-difference learning algorithm that is stable with linear function ap...
We consider a nonlinear discrete stochastic control system, and our goal is to design a feedback con...
This paper develops a unified framework to study finite-sample convergence guarantees of a large cl...
This paper investigates to what extent one can improve reinforcement learning algorithms. Our study ...
It is shown here that stability of the stochastic approximation algorithm is implied by the asymptot...
In this handout we analyse reinforcement learning algorithms for Markov decision processes. The read...
Along with the sharp increase in visibility of the field, the rate at which ne...
In this paper, we present a brief survey of reinforcement learning, with particular emphasis on stoc...
We address the problem of computing the optimal Q-function in Markov decision problems with infinit...
Reinforcement learning deals with the problem of sequential decision making in uncertain stochastic ...
Using a martingale concentration inequality, concentration bounds `from time $n_0$ on' are derived f...
Stochastic Approximation (SA) is a classical algorithm that has had since the early days a huge impa...
Reinforcement learning is a general computational framework for learning sequential decision strate...
With the increasing need for handling large state and action spaces, general function approximation ...