We consider the problem of imitation learning where the examples, demonstrated by an expert, cover only a small part of a large state space. Inverse Reinforcement Learning (IRL) provides an efficient tool for generalizing the demonstration, based on the assumption that the expert is acting optimally in a Markov Decision Process (MDP). Most past work on IRL requires that a (near-)optimal policy can be computed for different reward functions. However, this requirement can hardly be satisfied in systems with a large, or continuous, state space. In this paper, we propose a model-free IRL algorithm, where the relative entropy between the empirical distribution of the state-action trajectories under a uniform policy and their distribution ...
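The relative-entropy quantity this abstract refers to can be illustrated with a minimal sketch. The toy trajectory distributions below are assumptions for illustration only, not the paper's actual estimator; the sketch just computes the KL divergence between an empirical trajectory distribution under a uniform policy and a distribution induced by some candidate reward.

```python
import math

def relative_entropy(p, q):
    """KL divergence D(p || q) between two discrete distributions,
    each given as a dict mapping trajectories to probabilities."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)

# Toy example with three state-action trajectories (hypothetical values).
uniform = {"t1": 1 / 3, "t2": 1 / 3, "t3": 1 / 3}  # empirical dist. under a uniform policy
target = {"t1": 0.5, "t2": 0.3, "t3": 0.2}         # dist. induced by a candidate reward

print(relative_entropy(target, uniform))
```

In the model-free setting the paper describes, such a divergence would be estimated from sampled trajectories rather than from known probabilities; the exact probabilities here are only for readability.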
Existing inverse reinforcement learning (IRL) algorithms have assumed each expert’s demonstrated tra...
To address the problem that traditional inverse reinforcement learning algorithms are slow, imprecise,...
Various methods for solving the inverse reinforcement learning (IRL) problem have been developed ind...
Recent research has shown the benefit of framing problems of imitation learning as solutions to Mark...
We provide new perspectives and inference algorithms for Maximum Entropy (MaxEnt) Inverse Reinforcem...
We make decisions to maximize our perceived reward, but handcrafting a reward function for an autono...
We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov...
In decision-making problems, the reward function plays an important role in finding the best policy. Rein...
Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimi...
This paper proposes model-free imitation learning named Entropy-Regularized Imitation Learning (ERIL...
We study the problem of learning a policy in a Markov decision process (MDP) based on observations o...