We consider the reward-free exploration framework introduced by Jin et al. (2020), where an RL agent interacts with an unknown environment without any explicit reward function to maximize. The objective is to collect enough information during the exploration phase, so that a near-optimal policy can be immediately computed once any reward function is provided. In this paper, we move from the finite-horizon setting studied by Jin et al. (2020) to the more general setting of goal-conditioned RL, often referred to as stochastic shortest path (SSP). We first discuss the challenges specific to SSPs and then study two scenarios: 1) reward-free goal-free exploration in communicating MDPs, and 2) reward-free goal-free incrementa...
While parallelism has been extensively used in Reinforcement Learning (RL), the quantitative effects...
This paper presents a model that allows continual exploration to be tuned in an optimal way by integrating ...
One of the challenges in online reinforcement learning (RL) is that the agent ...
Many popular reinforcement learning problems (e.g., navigation in a maze, some...
Exploration is essential for reinforcement learning (RL). To face the challenges of exploration, we ...
We investigate the exploration of an unknown environment when no reward functi...
Goal-oriented Reinforcement Learning, where the agent needs to reach the goal state while simultaneo...
We revisit the incremental autonomous exploration problem proposed by Lim & Auer (2012). In this set...
We study the problem of learning in the stochastic shortest path (SSP) setting...
Reward-free reinforcement learning (RL) considers the setting where the agent does not have access t...
We introduce the Black Hole Reinforcement Learning problem, a previously unexplored variant of reinf...
In a reward-free environment, what is a suitable intrinsic objective for an agent to pursue so that ...
Realistic environments often provide agents with very limited feedback. When t...
Reward optimization in fully observable Markov decision processes is equivalent to a linear program ...