Inverse Reinforcement Learning (IRL) is attractive in scenarios where reward engineering can be tedious. However, prior IRL algorithms use on-policy transitions, which require intensive sampling from the current policy for stable and optimal performance. This limits IRL applications in the real world, where environment interactions can be highly expensive. To tackle this problem, we present Off-Policy Inverse Reinforcement Learning (OPIRL), which (1) adopts an off-policy data distribution instead of an on-policy one, significantly reducing the number of interactions with the environment, (2) learns a stationary reward function that is transferable and generalizes well to changing dynamics, and (3) leverages mode-cove...
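As a hedged illustration of point (1): off-policy learning replaces fresh on-policy rollouts with samples drawn from a replay buffer. A minimal sketch in Python, where ReplayBuffer and the transition format are hypothetical stand-ins rather than the actual OPIRL implementation:

    import random

    class ReplayBuffer:
        """Stores past transitions so updates can reuse old (off-policy) data."""
        def __init__(self, capacity=100000):
            self.data = []
            self.capacity = capacity

        def add(self, transition):  # transition = (state, action, reward, next_state)
            if len(self.data) >= self.capacity:
                self.data.pop(0)  # drop the oldest transition
            self.data.append(transition)

        def sample(self, batch_size):
            # Off-policy update: a batch of stored transitions, no new rollout needed.
            return random.sample(self.data, min(batch_size, len(self.data)))

    # An on-policy IRL method would re-collect rollouts from the current policy
    # at every update; an off-policy method reuses whatever the buffer holds.
    buffer = ReplayBuffer()
    buffer.add(("s0", "a0", 0.0, "s1"))
    batch = buffer.sample(32)

Because each reward/discriminator update consumes buffered data rather than fresh rollouts, the per-update interaction cost drops, which is the sample-efficiency argument the abstract makes.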
We propose a novel formulation for the Inverse Reinforcement Learning (IRL) problem, which jointly a...
Offline reinforcement learning enables learning from a fixed dataset, without further interactions w...
Offline reinforcement learning (RL) methods can generally be categorized into two types: RL-based an...
Reinforcement Learning (RL) is an effective approach to solving sequential decision-making problems wh...
We present an algorithm for Inverse Reinforcement Learning (IRL) from expert state observations only...
A major challenge faced by the machine learning community is solving decision-making problems under uncertai...
The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function R from a policy π. To...
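Stated as a worked formula (the standard IRL problem statement, not specific to this abstract's method): given a policy π, find a reward R under which π is optimal,

\[
\text{find } R \quad \text{s.t.} \quad \pi \in \arg\max_{\pi'} \; \mathbb{E}_{\pi'}\!\left[\sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t)\right],
\]

where γ is the discount factor. As Ng and Russell observed, this problem is ill-posed without further structure (e.g., R ≡ 0 makes every policy optimal), which is why IRL methods impose additional criteria on R.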
Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learni...
The goal of the inverse reinforcement learning (IRL) problem is to recover the reward functions from...
This paper proposes a model-free imitation learning method named Entropy-Regularized Imitation Learning (ERIL...
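For reference, the entropy-regularized objective that gives this family of methods its name is, in its standard form (the exact losses used in ERIL may differ):

\[
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\Big(r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big)\Big)\right],
\]

where \(\mathcal{H}\) denotes policy entropy and \(\alpha > 0\) weights the entropy bonus against the reward.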
Inverse Reinforcement Learning (IRL) is an effective approach to recover a rew...
Offline reinforcement learning (RL) provides a promising direction to exploit the massive amount of ...
Inverse reinforcement learnin...
Based on the premise that the most succinct representation of the behavior of an entity is its rewar...
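This premise is the starting point of apprenticeship learning; as a standard formulation in the style of Abbeel and Ng (a sketch of the idea, not necessarily this paper's exact setup): assume a linear reward \(R(s) = w^{\top}\phi(s)\) and match feature expectations,

\[
\mu(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} \phi(s_t)\right], \qquad \big|w^{\top}\mu(\pi) - w^{\top}\mu(\pi_E)\big| \le \|\mu(\pi) - \mu(\pi_E)\|_2 \ \ \text{for } \|w\|_2 \le 1,
\]

so a policy whose feature expectations match the expert's achieves near-expert value under any such reward.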
Marginalized importance sampling (MIS), which measures the density ratio between the state-action oc...
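In symbols, the quantity described here takes the following generic form (not necessarily the exact estimator of the cited work):

\[
w(s, a) = \frac{d^{\pi}(s, a)}{d^{\mu}(s, a)}, \qquad \hat{J}(\pi) = \frac{1}{1-\gamma} \cdot \frac{1}{n} \sum_{i=1}^{n} w(s_i, a_i)\, r_i,
\]

where \(d^{\pi}\) and \(d^{\mu}\) are the normalized, discounted state-action occupancies of the target and behavior policies, and the samples \((s_i, a_i, r_i)\) are drawn from \(d^{\mu}\).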