International audience. Leveraging an equivalence property in the state-space of a Markov Decision Process (MDP) has been investigated in several studies. This paper studies equivalence structure in the reinforcement learning (RL) setup, where transition distributions are no longer assumed to be known. We present a notion of similarity between transition probabilities of various state-action pairs of an MDP, which naturally defines an equivalence structure in the state-action space. We present equivalence-aware confidence sets for the case where the learner knows the underlying structure in advance. These sets are provably smaller than their corresponding equivalence-oblivious counterparts. In the more challenging case of an unknown equivalen...
With the increasing need for handling large state and action spaces, general function approximation ...
This thesis focuses on reinforcement learning (RL) which is a machine learning paradigm under which ...
We study the role of the representation of state-action value functions in reg...
Leveraging an equivalence property on the set of states of state-action pairs in a Markov Decision P...
We consider the problem of online reinforcement learning when several state re...
We consider an agent interacting with an environment in a single stream of act...
We consider a reinforcement learning setting where the learner does not have e...
We consider the regret minimization problem in reinforcement learning (RL) in the episodic setting. ...
In Reinforcement Learning (RL), regret guarantees scaling with the square root of the time horizon h...
We consider a Reinforcement Learning setup without any (esp. MDP) assumptions on the environment. St...
We study upper and lower bounds on the sample-complexity of learning near-optimal behaviour in finit...
The problem of reinforcement learning in an unknown and discrete Markov Decisi...
We study model-based reinforcement learning (RL) for episodic Markov decision processes (MDP) whose ...
Animals are able to rapidly infer from limited experience when sets of state action pairs have equiv...
Reinforcement learning (RL) has gained an increasing interest in recent years, being expected to del...