Learning desirable behavior from a limited number of demonstrations, also known as inverse reinforcement learning, is a challenging task in machine learning. I apply maximum likelihood estimation to the problem of inverse reinforcement learning, and show that it quickly and successfully identifies the unknown reward function from traces of optimal or near-optimal behavior, under the assumption that the reward function is a linear function of a known set of features. I extend this approach to cover reward functions that are a generalized function of the features, and show that the generalized inverse reinforcement learning approach is a competitive alternative to existing approaches covering the same class of functions, while at the same tim...
We consider learning in a Markov decision process where we are not explicitly given a reward functio...
In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be...
Inverse reinforcement learning (IRL) infers a reward function from demonstrations, allowing for poli...
This work handles the inverse reinforcement learning (IRL) problem where only a small number of demo...
Existing inverse reinforcement learning (IRL) algorithms have assumed each expert’s demonstrated tra...
Inverse reinforcement learning (IRL) addresses the problem of recovering a task descriptio...
In this paper we study the question of lifelong learning of behaviors from human demonstrations by ...
Based on the premise that the most succinct representation of the behavior of an entity is its rewar...
The problem of learning an expert’s unknown reward function using a limited number of demonstrations...
In decision-making problems, the reward function plays an important role in finding the best policy. Rein...
The purpose of this paper is to provide an overview of the theoretical background and applications ...
This paper deals with the Inverse Reinforcement Learning framework, whose purp...
Reinforcement Learning (RL) methods provide a solution for decision-making problems under uncertaint...
A major challenge faced by the machine learning community is decision-making problems under uncertai...