Learning Exploration Strategies in Model-Based Reinforcement Learning

Hester, Todd
Stone, Peter
Lopes, Manuel

Publication date

May 2013

Publisher

American College of Medical Physics (ACMP)

Abstract

Reinforcement learning (RL) is a paradigm for learning sequential decision-making tasks. However, the user must typically hand-tune exploration parameters for each domain and/or algorithm they use. In this work, we present an algorithm called LEO for learning these exploration strategies on-line. The algorithm uses bandit-type algorithms to adaptively select exploration strategies based on the rewards received when following them. We show empirically that this method performs well across a set of five domains. In contrast, for a given algorithm, no single set of parameters is best across all domains. Our results demonstrate that the LEO algorithm successfully learns the best exploration strategie...
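
The idea of using a bandit to pick among exploration strategies can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which bandit algorithm is used, so UCB1 is assumed here, and the strategy names, the simulated returns in [0, 1], and the names `StrategySelector` and `run_demo` are all hypothetical.

```python
import math
import random


class StrategySelector:
    """UCB1 bandit over a fixed set of exploration strategies (illustrative only)."""

    def __init__(self, strategy_names):
        self.names = list(strategy_names)
        self.counts = [0] * len(self.names)          # episodes run with each strategy
        self.mean_returns = [0.0] * len(self.names)  # running mean return per strategy

    def select(self):
        # Try every strategy once before applying the UCB1 rule.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        total = sum(self.counts)
        # UCB1 score: empirical mean plus an uncertainty bonus.
        scores = [
            self.mean_returns[i] + math.sqrt(2.0 * math.log(total) / self.counts[i])
            for i in range(len(self.names))
        ]
        return max(range(len(self.names)), key=scores.__getitem__)

    def update(self, index, episode_return):
        # Incremental update of the running mean for the chosen strategy.
        self.counts[index] += 1
        n = self.counts[index]
        self.mean_returns[index] += (episode_return - self.mean_returns[index]) / n


def run_demo(episodes=500, seed=0):
    rng = random.Random(seed)
    # Hypothetical strategies with made-up success probabilities (assumption).
    true_means = {"greedy": 0.3, "epsilon_greedy": 0.5, "optimistic_bonus": 0.7}
    selector = StrategySelector(true_means)
    for _ in range(episodes):
        i = selector.select()
        name = selector.names[i]
        # Stand-in for running one RL episode: a Bernoulli return in {0, 1}.
        episode_return = 1.0 if rng.random() < true_means[name] else 0.0
        selector.update(i, episode_return)
    return selector


if __name__ == "__main__":
    result = run_demo()
    for name, count, mean in zip(result.names, result.counts, result.mean_returns):
        print(f"{name}: chosen {count} times, mean return {mean:.2f}")
```

In a real agent, the Bernoulli draw would be replaced by the return of an episode run under the selected strategy, rescaled to [0, 1] so that the UCB1 bonus stays meaningful.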
