In the search for more sample-efficient reinforcement-learning (RL) algorithms, a promising direction is to leverage as much external off-policy data as possible, such as expert demonstrations. Several ideas have been proposed to make good use of demonstrations added to the replay buffer, such as pretraining on demonstrations only or minimizing additional cost functions. We present a new method that leverages both demonstrations and episodes collected online, in any sparse-reward environment and with any off-policy algorithm. Our method is based on a reward bonus given to demonstrations and to successful episodes (via relabeling), which encourages both expert imitation and self-imitation. Our experiments focus on several robotic-ma...
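The reward-bonus idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Transition`, `BonusReplayBuffer`, `relabel_with_bonus`, and `is_successful` are hypothetical, the bonus is assumed to be a constant added to every transition of a relabeled episode, and success is assumed to mean that the sparse environment reward was positive at least once in the episode.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Transition:
    """One step of experience (hypothetical container for this sketch)."""
    state: tuple
    action: int
    reward: float
    next_state: tuple
    done: bool


def relabel_with_bonus(episode: List[Transition], bonus: float) -> List[Transition]:
    """Return a copy of the episode with `bonus` added to every reward."""
    return [replace(t, reward=t.reward + bonus) for t in episode]


def is_successful(episode: List[Transition]) -> bool:
    """Assumption: in a sparse-reward task, any positive reward means success."""
    return any(t.reward > 0 for t in episode)


class BonusReplayBuffer:
    """Replay buffer that stores demonstrations and online episodes together.

    Demonstrations always receive the bonus; online episodes receive it
    only when they turn out to be successful (self-imitation).
    """

    def __init__(self, bonus: float = 1.0) -> None:
        self.bonus = bonus
        self.storage: List[Transition] = []

    def add_demonstration(self, episode: List[Transition]) -> None:
        self.storage.extend(relabel_with_bonus(episode, self.bonus))

    def add_online_episode(self, episode: List[Transition]) -> None:
        if is_successful(episode):
            episode = relabel_with_bonus(episode, self.bonus)
        self.storage.extend(episode)
```

Because the bonus is applied only when transitions are written to the buffer, any off-policy learner that samples from `storage` can be used unchanged, which is consistent with the claim that the method works with any off-policy algorithm.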
In many sequential decision-making problems (e.g., robotics control, game playing, sequential predic...
Offline reinforcement learning (RL) methods can generally be categorized into two types: RL-based an...
Imitation learning refers to a family of learning algorithms enabling the learning agents to learn d...
During recent years, deep reinforcement learning (DRL) has made successful inc...
Offline reinforcement learning (RL) can learn control policies from static datasets but, like standa...
We present an algorithm for Inverse Reinforcement Learning (IRL) from expert state observations only...
Inverse Reinforcement Learning (IRL) is attractive in scenarios where reward engineering can be tedi...
The current work on reinforcement learning (RL) from demonstrations often assumes the demonstrations...
Imitation Learning (IL) is a popular approach for teaching behavior policies to agents by demonstrat...
Reinforcement learning (RL) has demonstrated its superiority in solving sequential decision-making p...
In reinforcement learning (RL), it is challenging to learn directly from high-dimensional observatio...
Unlike classic supervised learning, reinforcement learning (RL) is fundamentally interactiv...
Reinforcement learning (RL) provides a powerful framework for decision-making, but its application i...
The lottery ticket hypothesis questions the role of overparameterization in supervised deep learning...
A common problem in Reinforcement Learning (RL) is that the reward function is hard to express. This...