Abstract. Reinforcement learning aims to derive an optimal policy for an often initially unknown environment. In the case of an unknown environment, exploration is used to acquire knowledge about it. In that context the well-known exploration-exploitation dilemma arises: when should one stop exploring and instead exploit the knowledge already gathered? In this paper we propose an uncertainty-based exploration method. We use uncertainty propagation to obtain the Q-function's uncertainty and then use the uncertainty in combination with the Q-values to guide the exploration to promising states that so far have been insufficiently explored. The uncertainty's weight during action selection can be influenced by a parameter. We evaluate one va...
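The action-selection idea described in the abstract can be illustrated with a minimal sketch. It assumes that the Q-values and their uncertainties are combined additively, with the uncertainty scaled by a weight; the abstract does not state the exact form, so the additive rule, the function name `select_action`, and the parameter name `xi` are hypothetical and chosen only for illustration. The uncertainties are taken as given here; how they are obtained via uncertainty propagation is not shown.

```python
import numpy as np


def select_action(q_values, q_sigma, xi=1.0):
    """Return the index of the action maximizing Q(s, a) + xi * sigma(s, a).

    q_values: estimated Q-values for each action in the current state.
    q_sigma:  uncertainty (e.g. standard deviation) of each Q-value estimate.
    xi:       hypothetical weight of the uncertainty; xi = 0 ignores the
              uncertainty (pure exploitation), larger xi favors actions
              whose value estimates are still uncertain.
    """
    q_values = np.asarray(q_values, dtype=float)
    q_sigma = np.asarray(q_sigma, dtype=float)
    return int(np.argmax(q_values + xi * q_sigma))


# Two actions: the first looks better, the second is less well explored.
q = [1.0, 0.8]
sigma = [0.1, 0.5]
greedy = select_action(q, sigma, xi=0.0)       # uses Q-values only
exploratory = select_action(q, sigma, xi=1.0)  # uncertainty bonus included
```

With `xi = 0.0` the sketch picks the action with the highest Q-value; with `xi = 1.0` the less-explored second action wins because its uncertainty bonus outweighs the small gap in estimated value.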
Decision theory addresses the task of choosing an action; it provides robust decision-making criteri...
It is well known that quantifying uncertainty in the action-value estimates is crucial for efficient...
Deep, model-based reinforcement learning has shown state-of-the-art, human-exceeding performance in ...
Reinforcement learning systems are often concerned with balancing exploration of untested actions ag...
This paper studies directed exploration for reinforcement learning agents by tracking uncertainty ab...
Uncertainty is ubiquitous in games, both in the agents playing games and often in the games themselv...
The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL...
Sequential decision tasks with incomplete information are characterized by the exploration problem; ...
Abstract. In this paper we address the reliability of policies derived by Reinforcement Learning on a...
Offline reinforcement learning, or learning from a fixed data set, is an attractive alternative to o...
In many Reinforcement Learning (RL) tasks, the classical online interaction of the learning agent wi...
Handling uncertainty is an important part of decision-making. Leveraging uncertainty for guiding exp...