Conveying complex objectives to reinforcement learning (RL) agents often requires meticulous reward engineering. Preference-based RL methods are able to learn a more flexible reward model based on human preferences by actively incorporating human feedback, i.e. teacher's preferences between two clips of behaviors. However, poor feedback-efficiency still remains a problem in current preference-based RL algorithms, as tailored human feedback is very expensive. To handle this issue, previous methods have mainly focused on improving query selection and policy initialization. At the same time, recent exploration methods have proven to be a recipe for improving sample-efficiency in RL. We present an exploration method specifically for preference-...
Abstract. This paper focuses on reinforcement learning (RL) with lim-ited prior knowledge. In the do...
Reinforcement learning (RL) is one of the three basic paradigms of machine learning. It has demonstr...
Over the course of the last decade, the framework of reinforcement learning has developed into a pro...
Reinforcement learning (RL) techniques optimize the accumulated long-term reward of a suitably chose...
International audienceThis paper focuses on reinforcement learning (RL) with limited prior knowledge...
The utility of reinforcement learning is limited by the alignment of reward functions with the inter...
Common reinforcement learning algorithms assume access to a numeric feedback signal. The numeric fee...
Deploying learning systems in the real-world requires aligning their objectives with those of the hu...
We define a novel neuro-symbolic framework, argumentative reward learning, which combines preference...
The current reward learning from human preferences could be used to resolve complex reinforcement le...
The field of Reinforcement Learning is concerned with teaching agents to take optimal decisions t...
Practical implementations of deep reinforcement learning (deep RL) have been challenging due to an a...
In reinforcement learning (RL), it is challenging to learn directly from high-dimensional observatio...
Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tun...
To convey desired behavior to a Reinforcement Learning (RL) agent, a designer must choose a reward f...
Abstract. This paper focuses on reinforcement learning (RL) with lim-ited prior knowledge. In the do...
Reinforcement learning (RL) is one of the three basic paradigms of machine learning. It has demonstr...
Over the course of the last decade, the framework of reinforcement learning has developed into a pro...
Reinforcement learning (RL) techniques optimize the accumulated long-term reward of a suitably chose...
International audienceThis paper focuses on reinforcement learning (RL) with limited prior knowledge...
The utility of reinforcement learning is limited by the alignment of reward functions with the inter...
Common reinforcement learning algorithms assume access to a numeric feedback signal. The numeric fee...
Deploying learning systems in the real-world requires aligning their objectives with those of the hu...
We define a novel neuro-symbolic framework, argumentative reward learning, which combines preference...
The current reward learning from human preferences could be used to resolve complex reinforcement le...
The field of Reinforcement Learning is concerned with teaching agents to take optimal decisions t...
Practical implementations of deep reinforcement learning (deep RL) have been challenging due to an a...
In reinforcement learning (RL), it is challenging to learn directly from high-dimensional observatio...
Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tun...
To convey desired behavior to a Reinforcement Learning (RL) agent, a designer must choose a reward f...
Abstract. This paper focuses on reinforcement learning (RL) with lim-ited prior knowledge. In the do...
Reinforcement learning (RL) is one of the three basic paradigms of machine learning. It has demonstr...
Over the course of the last decade, the framework of reinforcement learning has developed into a pro...