Solving Markov decision processes (MDPs) efficiently is challenging in many settings, for example when the state or action space is large, when rewards are sparse and delayed, or when the agent faces a distribution over MDPs. Structure in the policy, value function, reward function, or state space can accelerate learning. In this thesis, we exploit such structure in MDPs to solve them effectively and efficiently. First, we study problems with a concave value function and a base-stock policy, and leverage these two structures to propose an approximate dynamic programming (ADP) algorithm. Next, we study the exploration problem in unknown MDPs, introduce a structured intrinsic reward, and propose a Bayes-o...
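To make the first kind of structure concrete, here is a minimal sketch of how a base-stock (order-up-to) policy falls out of dynamic programming when the cost-to-go is convex in the inventory level. This is an illustrative toy, not the thesis's ADP algorithm; the horizon, cost parameters, demand distribution, and truncated state grid below are all assumptions made for the example.

```python
import numpy as np

# Sketch (assumed parameters): finite-horizon single-item inventory MDP with
# convex holding/backlog costs, so the optimal policy is base-stock.

T = 5                                     # planning horizon
demand = np.arange(4)                     # possible demands 0..3
probs = np.array([0.1, 0.4, 0.3, 0.2])    # demand distribution
c, h, b = 1.0, 0.5, 3.0                   # ordering, holding, backlog cost
levels = np.arange(-10, 21)               # inventory levels (negative = backlog)

V = np.zeros(len(levels))                 # terminal value V_T = 0
base_stock = []

for t in reversed(range(T)):
    def G(y):
        """Cost of ordering up to y: order cost + expected stage cost + future value."""
        post = y - demand                                    # inventory after demand
        stage = probs @ (h * np.maximum(post, 0) + b * np.maximum(-post, 0))
        idx = np.clip(post - levels[0], 0, len(levels) - 1)  # truncate at grid boundary
        return c * y + stage + probs @ V[idx]

    G_vals = np.array([G(y) for y in levels])
    S = levels[np.argmin(G_vals)]          # convexity of G  =>  base-stock level S_t
    base_stock.append(int(S))

    # Base-stock policy: order up to S if inventory x is below S, else order nothing.
    V = np.array([G_vals[max(x, S) - levels[0]] - c * x for x in levels])

print("base-stock levels for t = T-1, ..., 0:", base_stock)
```

Because the one-period cost-to-go G is convex under these assumptions, each period's policy is summarized by a single order-up-to level S_t rather than a full state-to-action table, which is the kind of structural simplification the abstract refers to.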
This paper addresses the problem of planning under uncertainty in large Markov Decision Processes (M...
Problems involving optimal sequential decision making in uncertain dynamic systems arise in domains such as e...
This dissertation investigates the problem of representation discovery in discrete Markov decision p...
We present a hierarchical reinforcement learning framework that formulates each task in the hierarch...
University of Minnesota M.S. thesis. June 2012. Major: Computer science. Advisor: Prof. Paul Schrate...
The running time of classical algorithms for Markov Decision Processes (MDPs) typically grows li...
We present an approximation scheme for solving Markov Decision Processes (MDPs) in whi...
This paper provides new techniques for abstracting the state space of a Markov Decision Process (MD...
Sequential decision making is a fundamental task faced by any intelligent agent in an extended inter...
In this paper we describe recent progress in our work on Value Function Discovery (VFD), a novel me...
We introduce a class of Markov decision problems (MDPs) which greatly simplify Reinforcement Learnin...
This chapter presents an overview of simulation-based techniques useful for solving Markov decision ...
Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI r...