Model-Based Reinforcement Learning (MBRL) can greatly profit from using world models for estimating the consequences of selecting particular actions: an animat can construct such a model from its experiences and use it for computing rewarding behavior. We study the problem of collecting useful experiences through exploration in stochastic environments. Towards this end we use MBRL to maximize exploration rewards (in addition to environmental rewards) for visiting states that promise information gain. We also combine MBRL with the Interval Estimation algorithm (Kaelbling, 1993). Experimental results demonstrate the advantages of our approaches.
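The abstract leaves the bonus computation implicit. As a rough illustration only, the following is a minimal sketch of tabular model-based exploration with a count-based bonus standing in as a proxy for expected information gain; the toy chain environment, the bonus form c / sqrt(n(s, a)), and all names (`plan`, `true_step`, `bonus_c`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's exact algorithm):
# tabular MBRL where planning maximizes environmental reward plus a
# count-based exploration bonus, a common proxy for information gain.

n_states, n_actions, gamma, bonus_c = 5, 2, 0.95, 1.0
rng = np.random.default_rng(0)

counts = np.ones((n_states, n_actions, n_states))  # transition counts (pseudo-count prior)
reward_sum = np.zeros((n_states, n_actions))       # accumulated rewards per (s, a)
visits = np.zeros((n_states, n_actions))           # visit counts per (s, a)

def plan(n_iters=100):
    """Value-iterate on the estimated model with a bonus-augmented reward."""
    P = counts / counts.sum(axis=2, keepdims=True)    # estimated transition model
    R = reward_sum / np.maximum(visits, 1)            # estimated mean reward
    bonus = bonus_c / np.sqrt(np.maximum(visits, 1))  # assumed bonus: c / sqrt(n(s, a))
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        V = Q.max(axis=1)
        Q = R + bonus + gamma * P @ V                 # (S, A, S) @ (S,) -> (S, A)
    return Q

def true_step(s, a):
    """Toy environment driving the sketch: a noisy chain, reward at the far end."""
    s2 = min(s + 1, n_states - 1) if (a == 1 and rng.random() < 0.8) else max(s - 1, 0)
    return s2, float(s2 == n_states - 1)

s = 0
for _ in range(500):
    a = int(np.argmax(plan()[s]))  # greedy w.r.t. bonus-augmented values (replanned each step)
    s2, r = true_step(s, a)
    counts[s, a, s2] += 1
    visits[s, a] += 1
    reward_sum[s, a] += r
    s = s2
```

The paper's second approach, Interval Estimation (Kaelbling, 1993), would instead act greedily with respect to an upper confidence bound on the estimated action values; the additive bonus above plays a loosely analogous optimism-under-uncertainty role.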
Recent Reinforcement Learning (RL) algorithms, such as R-MAX, make (with high probability) only a sm...
We generalise the problem of reward modelling (RM) for reinforcement learning (RL) to handle non-Mar...
Deep exploration requires coordinated long-term planning. We present a model-based reinforcement le...
Reinforcement learning can greatly profit from world models updated by experience and used for comp...
Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the cur...
Reinforcement learning systems are often concerned with balancing exploration of untested actions ag...
Reinforcement learning (RL) is a paradigm for learning sequential decision mak...
The impetus for exploration in reinforcement learning (RL) is decreasing uncertainty about the envir...
One problem of current Reinforcement Learning algorithms is finding a balance between exploitation o...
Realistic environments often provide agents with very limited feedback. When t...
Equipping artificial agents with useful exploration mechanisms remains a challenge to this day. Huma...
This paper discusses parameter-based exploration methods for reinforcement learning. Parameter-based...