Data-driven and learning-based methodologies have become very popular in modern decision-making systems. In order to make optimal use of data and computational resources, these problems require theoretically sound procedures for choosing between estimators, tuning their parameters, and understanding bias/variance trade-offs. In many settings, asymptotic and/or worst-case theory fails to provide the relevant guidance. In this dissertation, I present some recent advances that involve a more refined approach, one that leads to non-asymptotic and instance-optimal guarantees. Focusing on function approximation methods for policy evaluation in reinforcement learning, in Part I, I describe a novel class of optimal oracle inequalities for projected Be...
This paper introduces new optimality-preserving operators on Q-functions. We first describe an opera...
We consider batch reinforcement learning problems in continuous space, expected total discounted-rew...
Consider a given value function on states of a Markov decision problem, as might result from applyin...
We propose and analyze a reinforcement learning principle that approximates the Bellman equations by...
There are several reinforcement learning algorithms that yield approximate solutions for the proble...
This paper introduces a set of algorithms for Monte-Carlo Bayesian reinforcement learning. Firstly, ...
Sample-efficient offline reinforcement learning (RL) with linear function approximation has been stu...
This paper aims at theoretically and empirically comparing two standard optimi...
In learning problems, avoiding overfitting the training data is of fundamental importance in order to...
Various methods for solving the inverse reinforcement learning (IRL) problem have been developed ind...
This thesis is mostly focused on reinforcement learning, which is viewed as an optimization problem:...
Reinforcement learning (RL) is a machine learning answer to the optimal contro...
We consider the problem of minimizing the long term average expected regret of an agent in an online...
Modern technological advances have prompted massive scale data collection in many modern fields such ...
This paper addresses the problem of automatic generation of features for value function approximatio...