We study model-based reinforcement learning (RL) for episodic Markov decision processes (MDPs) whose transition probability is parametrized by an unknown transition core with features of state and action. Despite much recent progress in analyzing algorithms in the linear MDP setting, the understanding of more general transition models remains very limited. In this paper, we propose a provably efficient RL algorithm for MDPs whose state transition is given by a multinomial logistic model. We show that our proposed algorithm, based on upper confidence bounds, achieves an O(d√(H^3 T)) regret bound, where d is the dimension of the transition core, H is the horizon, and T is the total number of steps. To the best of our knowledge, this is the fir...
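As a rough illustration of the multinomial logistic transition model named in the abstract above, the following minimal sketch computes next-state probabilities as a softmax over linear scores. It assumes one feature vector per candidate next state and a shared parameter vector standing in for the transition core. The function name `mnl_transition_probs`, the feature layout, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def mnl_transition_probs(features, theta):
    """Return softmax probabilities over candidate next states.

    features: list of feature vectors, one per candidate next state
              (assumed layout, for illustration only)
    theta:    parameter vector playing the role of the transition core
    """
    # Linear score for each candidate next state: inner product with theta.
    logits = [sum(f * t for f, t in zip(phi, theta)) for phi in features]
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: three candidate next states with 2-dimensional features.
probs = mnl_transition_probs(
    [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    [0.5, -0.5],
)
```

In this toy setup the first candidate has the largest score and so receives the largest probability. An algorithm of the kind the abstract describes would estimate `theta` from observed transitions rather than take it as given.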
We introduce a class of MDPs which greatly simplify Reinforcement Learning. They have discrete state...
We study reinforcement learning for continuous-time Markov decision processes (MDPs) in the finite-h...
Markov decision processes (MDPs) are an established framework for solving sequential decision-makin...
In this paper, we study the problem of transferring the available Markov Decision Process (MDP) mode...
University of Minnesota M.S. thesis. June 2012. Major: Computer science. Advisor: Prof. Paul Schrate...
Reinforcement Learning (RL) in finite state and action Markov Decision Processes is studied with an ...
We introduce a class of Markov decision problems (MDPs) which greatly simplify Reinforcement Learnin...
This study is concerned with finite Markov decision processes (MDPs) whose states are exactly observa...
The standard RL world model is that of a Markov Decision Process (MDP). A basic premise of MDPs is t...
We consider the problem of learning the optimal action-value function in discounted-reward...
The problem of reinforcement learning in a non-Markov environment is explored using a dynami...
We consider model selection for classic Reinforcement Learning (RL) environments -- Multi Armed Band...
We present a class of metrics, defined on the state space of a finite Markov decision process (MDP)...
The problem of selecting the right state-representation in a reinforcement learning problem is consi...