We study diverse skill discovery in reward-free environments, aiming to discover all possible skills in simple grid-world environments where prior methods have struggled to succeed. This problem is formulated as mutual training of skills using an intrinsic reward and a discriminator trained to predict a skill given its trajectory. Our initial solution replaces the standard one-vs-all (softmax) discriminator with a one-vs-one (all pairs) discriminator and combines it with a novel intrinsic reward function and a dropout regularization technique. The combined approach is named APART: Diverse Skill Discovery using All Pairs with Ascending Reward and Dropout. We demonstrate that APART discovers all the possible skills in grid worlds with remarka...
National audienceTraining virtual agents to play a game using reinforcement learning (RL) has gained...
An ability to adjust to changing environments and unforeseen circumstances is likely to be an import...
This paper presents a method by which a rein-forcement learning agent can automatically dis-cover ce...
While reinforcement learning has recently been able to achieve unprecedented success, it often comes...
Robot Reinforcement Learning (RL) algorithms return a policy that maximizes a global cumulative rew...
An efficient way for a deep reinforcement learning (DRL) agent to explore can be to learn a set of s...
Unsupervised skill learning objectives (Gregor et al., 2016, Eysenbach et al., 2018) allow agents to...
Reinforcement learning (RL) aims to learn optimal behaviors for agents to maximize cumulative reward...
This paper presents a method by which a reinforcement learning agent can automatically discover cert...
In many real-world problems, reward signals received by agents are delayed or sparse, which makes it...
Practising and honing skills forms a fundamental component of how humans learn, yet artificial agent...
Among the most impressive of aspects of human intelligence is skill acquisition—the ability to ident...
Current reinforcement learning (RL) in robotics often experiences difficulty in generalizing to new ...
Over the course of the last decade, the framework of reinforcement learning has developed into a pro...
Reinforcement Learning (RL) is an elegant approach to tackle sequential decision-making problems. In...
National audienceTraining virtual agents to play a game using reinforcement learning (RL) has gained...
An ability to adjust to changing environments and unforeseen circumstances is likely to be an import...
This paper presents a method by which a rein-forcement learning agent can automatically dis-cover ce...
While reinforcement learning has recently been able to achieve unprecedented success, it often comes...
Robot Reinforcement Learning (RL) algorithms return a policy that maximizes a global cumulative rew...
An efficient way for a deep reinforcement learning (DRL) agent to explore can be to learn a set of s...
Unsupervised skill learning objectives (Gregor et al., 2016, Eysenbach et al., 2018) allow agents to...
Reinforcement learning (RL) aims to learn optimal behaviors for agents to maximize cumulative reward...
This paper presents a method by which a reinforcement learning agent can automatically discover cert...
In many real-world problems, reward signals received by agents are delayed or sparse, which makes it...
Practising and honing skills forms a fundamental component of how humans learn, yet artificial agent...
Among the most impressive of aspects of human intelligence is skill acquisition—the ability to ident...
Current reinforcement learning (RL) in robotics often experiences difficulty in generalizing to new ...
Over the course of the last decade, the framework of reinforcement learning has developed into a pro...
Reinforcement Learning (RL) is an elegant approach to tackle sequential decision-making problems. In...
National audienceTraining virtual agents to play a game using reinforcement learning (RL) has gained...
An ability to adjust to changing environments and unforeseen circumstances is likely to be an import...
This paper presents a method by which a rein-forcement learning agent can automatically dis-cover ce...