Inverse reinforcement learning attempts to reconstruct the reward function in a Markov decision problem from observations of an agent's actions. As already observed in Russell [1998], the problem is ill-posed: the reward function is not identifiable, even with perfect information about optimal behavior. We provide a resolution to this non-identifiability for problems with entropy regularization. For a given environment, we fully characterize the reward functions leading to a given policy and show that, given demonstrations of actions for the same reward under two distinct discount factors, or under sufficiently different environments, the unobserved reward can be recovered up to a constant. We also give general neces...
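The non-identifiability described above can be illustrated numerically. The following is a minimal sketch, not the paper's own construction: it assumes a small random tabular MDP, a temperature of 1, and the standard soft (log-sum-exp) Bellman recursion, and it uses the well-known potential-based shaping transformation r'(s, a) = r(s, a) + phi(s) - gamma * E[phi(s')] as one example of a reward change that leaves the entropy-regularized optimal policy untouched. All names (`soft_optimal_policy`, `phi`, the MDP sizes) are hypothetical and chosen for illustration only.

```python
import numpy as np


def soft_optimal_policy(reward, transition, gamma, temperature=1.0, iters=2000):
    """Entropy-regularized (soft) value iteration on a tabular MDP.

    reward:     array of shape (S, A)
    transition: array of shape (S, A, S), rows sum to 1 over the last axis
    Returns the soft-optimal policy as an array of shape (S, A).
    """
    n_states, _ = reward.shape
    value = np.zeros(n_states)
    for _ in range(iters):
        # Soft Bellman backup: Q = r + gamma * E[V(s')]
        q = reward + gamma * (transition @ value)
        # V(s) = temperature * log sum_a exp(Q(s, a) / temperature), computed stably
        q_max = q.max(axis=1)
        value = q_max + temperature * np.log(
            np.exp((q - q_max[:, None]) / temperature).sum(axis=1)
        )
    # Policy is a softmax of Q over actions (again computed stably)
    q = reward + gamma * (transition @ value)
    logits = (q - q.max(axis=1, keepdims=True)) / temperature
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


# Assumed toy problem: 4 states, 3 actions, random dynamics and rewards.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 4, 3, 0.9
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)          # normalize into transition probabilities
r = rng.normal(size=(n_states, n_actions))

# Arbitrary potential phi(s); the shaped reward differs from r by more than a constant.
phi = rng.normal(size=n_states)
r_shaped = r + phi[:, None] - gamma * (P @ phi)

pi_original = soft_optimal_policy(r, P, gamma)
pi_shaped = soft_optimal_policy(r_shaped, P, gamma)

# Largest difference between the two policies; expected to be at rounding-error level.
print("max policy difference:", np.abs(pi_original - pi_shaped).max())
```

Under these assumptions the two rewards differ state by state, yet the two policies agree to numerical precision, so demonstrations from a single environment and a single discount factor cannot distinguish them. Because the shaping term depends on both `gamma` and `P`, changing the discount factor or the dynamics changes which rewards are equivalent, which is consistent with the abstract's remark about two discount factors or sufficiently different environments.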
Inverse Reinforcement Learning (IRL) is an effective approach to recover a rew...
We make decisions to maximize our perceived reward, but handcrafting a reward function for an autono...
This work handles the inverse reinforcement learning (IRL) problem where only a small number of demo...
While Reinforcement Learning (RL) aims to train an agent from a reward function in a given environme...
Inverse reinforcement learning (IRL) addresses the problem of recovering the unknown reward functio...
The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function R from a policy pi. To...
Reinforcement Learning (RL) is an effective approach to solve sequential decision making problems wh...
This paper deals with the Inverse Reinforcement Learning framework, whose purp...
We consider learning in a Markov decision process where we are not explicitly given a reward functio...
Inverse reinforcement learning has proved its ability to explain state-action trajectories of expert...
Inverse Reinforcement Learning (IRL) aims to recover a reward function from expert demonstrations in...
We consider the problem of imitation learning where the examples, demonstrated by an expert, cover o...
Various methods for solving the inverse reinforcement learning (IRL) problem have been developed ind...
The goal of inverse reinforcement learning is to find a reward function for a Markov decision proces...
We study the problem of learning a policy in a Markov decision process (MDP) based on observations o...