In Reinforcement Learning, legible behavior requires maintaining a policy that is easily discernible from a set of other policies. While legibility has been thoroughly addressed in Explainable Planning, little work exists in the Reinforcement Learning literature. As we propose in this paper, injecting legible behavior into an agent's policy does not require modifying components of its learning algorithm. Rather, the agent's optimal policy can be regularized for legibility by evaluating how the policy may produce observations that would lead an observer to infer an incorrect policy. In our formulation, the decision boundary introduced by legibility impacts the states in which the agent's policy returns an action that has high likeliho...
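The regularization idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's formulation, which is truncated here. It assumes tabular policies, an observer who applies Bayes' rule to a single state-action observation, and a hypothetical `weight` parameter that controls how strongly the target policy is pushed toward actions the observer would attribute to it. The names `observer_posterior` and `legible_policy` are illustrative only.

```python
import numpy as np

def observer_posterior(policies, state, action, prior=None):
    """Posterior over candidate policies after observing (state, action).

    policies: array of shape (n_policies, n_states, n_actions), rows sum to 1.
    Assumption: the observer uses Bayes' rule with a uniform prior by default.
    """
    policies = np.asarray(policies, dtype=float)
    n = policies.shape[0]
    prior = np.full(n, 1.0 / n) if prior is None else np.asarray(prior, dtype=float)
    joint = prior * policies[:, state, action]
    total = joint.sum()
    # If no policy can produce the observation, fall back to the prior.
    return joint / total if total > 0 else prior

def legible_policy(policies, target, state, weight=1.0):
    """Reweight the target policy's action distribution in one state.

    Each action's probability is multiplied by the observer's posterior for
    the target policy (raised to `weight`), then renormalized. weight=0
    returns the original distribution unchanged.
    """
    policies = np.asarray(policies, dtype=float)
    n_actions = policies.shape[2]
    post = np.array([
        observer_posterior(policies, state, a)[target] for a in range(n_actions)
    ])
    scores = policies[target, state] * post ** weight
    return scores / scores.sum()

# Two candidate policies, one state, two actions (made-up numbers).
P = np.array([
    [[0.5, 0.5]],   # policy 0: indifferent between the actions
    [[0.9, 0.1]],   # policy 1: strongly prefers action 0
])
print(legible_policy(P, target=0, state=0))
```

In this toy case, policy 0 shifts probability toward action 1, the action policy 1 rarely takes, so an observer is less likely to mistake it for policy 1.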
Choosing actions within norm-regulated environments involves balancing achieving one’s goals and cop...
Policy constraint methods to offline reinforcement learning (RL) typically utilize parameterization ...
We present a model-based offline reinforcement learning policy performance lower bound that explicit...
In this paper we propose a method to augment a Reinforcement Learning agent with legibility. This me...
In this paper we investigate the notion of legibility in sequential decision tasks under uncertainty...
Though reinforcement learning has greatly benefited from the incorporation of neural networks, the i...
© 2018 Curran Associates Inc. All rights reserved. While deep reinforcement learning has successfully...
We study the problem of generating interpretable and verifiable policies for Reinforcement Learning ...
Today’s advanced Reinforcement Learning algorithms produce black-box policies that are often diffic...
In some reinforcement learning problems an agent may be provided with a set of input polic...
We contribute Policy Reuse as a technique to improve a reinforcement learning agent with guidance f...
In applying reinforcement learning to agents acting in the real world we are often faced with tasks ...
This paper addresses the issue of interpretability and auditability of reinforcement-learning agents...
Offline reinforcement learning -- learning a policy from a batch of data -- is known to be hard for ...
In this paper, we describe how techniques from reinforcement learning might be used to approach the ...