We study the problem of learning a policy in a Markov decision process (MDP) based on observations of the actions taken by multiple teachers. We assume that the teachers are like-minded in that their reward functions -- while different from each other -- are random perturbations of an underlying reward function. Under this assumption, we demonstrate that inverse reinforcement learning algorithms that satisfy a certain property -- that of matching feature expectations -- yield policies that are approximately optimal with respect to the underlying reward function, and that no algorithm can do better in the worst case. We also show how to efficiently recover the optimal policy when the MDP has one state -- a setting that is akin to multi-armed...
We consider the problem of imitation learning where the examples, demonstrated by an expert, cover o...
Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some...
We address the problem of inverse reinforcement learning in Markov decision processes where the agen...
We consider learning in a Markov decision process where we are not explicitly given a reward functio...
In decision-making problems, the reward function plays an important role in finding the best policy. Rein...
This paper deals with the Inverse Reinforcement Learning framework, whose purp...
We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonst...
The field of Reinforcement Learning is concerned with teaching agents to take optimal decisions t...
We consider the problem of learning by demonstration from agents acting in unknown stochastic Mark...
In this paper we study the question of lifelong learning of behaviors from human demonstrations by ...
Learning desirable behavior from a limited number of demonstrations, also known as inverse reinforce...
Various methods for solving the inverse reinforcement learning (IRL) problem have been developed ind...
Inverse reinforcement learning (IRL) addresses the problem of recovering a task descriptio...
Inverse reinforcement learning (IRL) addresses the problem of recovering the unknown reward functio...