Existing inverse reinforcement learning (IRL) algorithms have assumed each expert’s demonstrated trajectory to be produced by only a single reward function. This paper presents a novel generalization of the IRL problem that allows each trajectory to be generated by multiple locally consistent reward functions, hence catering to more realistic and complex experts’ behaviors. Solving our generalized IRL problem thus involves not only learning these reward functions but also the stochastic transitions between them at any state (including unvisited states). By representing our IRL problem with a probabilistic graphical model, an expectation-maximization (EM) algorithm can be devised to iteratively learn the different reward functions and the...
Inverse Reinforcement Learning (IRL) is an effective approach to recover a rew...
This paper reports theoretical and empirical results obtained for the score-ba...
This paper reports theoretical and empirical results obtained for the score-ba...
Existing inverse reinforcement learning (IRL) algorithms have assumed each expert’s demonstrated tra...
This work handles the inverse reinforcement learning (IRL) problem where only a small number of demo...
Learning desirable behavior from a limited number of demonstrations, also known as inverse reinforce...
The problem of learning an expert’s unknown reward function using a limited number of demonstrations...
The problem of learning an expert’s unknown reward function using a limited number of demonstrations...
Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some...
Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some...
Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some...
Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some...
This work handles the inverse reinforcement learning (IRL) problem where only a small number of demo...
This paper reports theoretical and empirical results obtained for the score-ba...
The goal of the inverse reinforcement learning (IRL) problem is to recover the reward functions from...
Inverse Reinforcement Learning (IRL) is an effective approach to recover a rew...
This paper reports theoretical and empirical results obtained for the score-ba...
This paper reports theoretical and empirical results obtained for the score-ba...