To solve partially observable Markov decision problems, we introduce HQ-learning, a hierarchical extension of Q-learning. HQ-learning is based on an ordered sequence of subagents, each learning to identify and solve a Markovian subtask of the total task. Each agent learns (1) an appropriate subgoal (though there is no intermediate, external reinforcement for "good" subgoals), and (2) a Markovian policy, given a particular subgoal. Our experiments demonstrate: (a) The system can easily solve tasks standard Q-learning cannot solve at all. (b) It can solve partially observable mazes with more states than those used in most previous POMDP work. (c) It can quickly solve complex tasks that require manipulation of the environment to free...
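The abstract above describes subagents that each learn a Markovian policy via Q-learning. As context, a minimal sketch of the standard tabular Q-learning update each subagent would perform is shown below; this is illustrative only (the function and state/action names are hypothetical), not the HQ-learning algorithm itself, which additionally learns subgoals on top of such updates.

```python
# Minimal tabular Q-learning update (illustrative sketch; the subgoal
# machinery of HQ-learning is not shown). All names are hypothetical.

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]

# Tiny usage example on a two-state chain.
Q = {0: {"left": 0.0, "right": 0.0}, 1: {"left": 0.0, "right": 0.0}}
q_update(Q, s=0, a="right", r=1.0, s_next=1)
print(Q[0]["right"])  # 0.1 after one update with alpha=0.1, since Q(s') is all zero
```

With an all-zero table, the update reduces to `alpha * r`, which is why the first value moves to 0.1.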
A very general framework for modeling uncertainty in learning environments is given by Partially Obs...
The theory of partially observable Markov decision processes (POMDPs) is a useful tool for developin...
Partially observable Markov decision processes (POMDPs) are interesting because they provide a gener...
HQ-learning is a hierarchical extension of Q(λ)-learning designed to solve certain types of partially...
* This research was partially supported by the Latvian Science Foundation under grant No. 02-86d. Effi...
We present a hierarchical reinforcement learning framework that formulates each task in the hierarch...
This thesis addresses the open problem of automatically discovering hierarchical structure in reinfo...
Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning contr...
Abstract. Reinforcement learning is bedeviled by the curse of dimensionality: the number of paramete...
Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due ...
Reinforcement learning (RL) is an area of Machine Learning (ML) concerned with learning how a softwa...
People are efficient when they make decisions under uncertainty, even when their decisions have long...
Weakly-coupled Markov decision processes can be decomposed into subprocesses that interact only thro...
This paper considers the problem of intelligent agent functioning in non-Markovian environments. We ...