A construct that has been receiving attention recently in reinforcement learning is stochastic factorization (SF), a particular case of non-negative factorization (NMF) in which the matrices involved are stochastic. The idea is to use SF to approximate the transition matrices of a Markov decision process (MDP). This is useful for two reasons. First, learning the factors of the SF instead of the transition matrices can reduce significantly the number of parameters to be estimated. Second, it has been shown that SF can be used to reduce the number of operations needed to compute an MDP's value function. Recently, an algorithm called expectation-maximization SF (EMSF) has been proposed to compute a SF directly from transitions sampled from an ...
<p>This dissertation describes sequential decision making problems in non-stationary environments. O...
Abstract The problem of reinforcement learning in a non-Markov environment isexplored using a dynami...
Reinforcement learning is a general computational framework for learning sequential decision strate...
Nonnegative matrix factorization (NMF) has become a popular dimension-reduction method and has been ...
This chapter presents and evaluates an online representation selection method for factored Markov de...
Reinforcement learning deals with the problem of sequential decision making in uncertain stochastic ...
Kernel-based reinforcement learning (KBRL) stands out among approximate re-inforcement learning algo...
Fitted Q-iteration (FQI) stands out among reinforcement learning algorithms for its flexibility and ...
Recent decision-theoric planning algorithms are able to find optimal solutions in large problems, us...
Using the expression for the unnormalized nonlinear filter for a hidden Markov model, we develop a d...
The Markov decision process (MDP) formulation used to model many real-world sequential decision maki...
The Markov decision process (MDP) formulation used to model many real-world sequential decision maki...
Reinforcement Learning (RL) in either fully or partially observable domains usually poses a requirem...
Recent decision-theoric planning algorithms are able to find optimal solutions in large problems, us...
We consider the problem of minimizing the long term average expected regret of an agent in an online...
<p>This dissertation describes sequential decision making problems in non-stationary environments. O...
Abstract The problem of reinforcement learning in a non-Markov environment isexplored using a dynami...
Reinforcement learning is a general computational framework for learning sequential decision strate...
Nonnegative matrix factorization (NMF) has become a popular dimension-reduction method and has been ...
This chapter presents and evaluates an online representation selection method for factored Markov de...
Reinforcement learning deals with the problem of sequential decision making in uncertain stochastic ...
Kernel-based reinforcement learning (KBRL) stands out among approximate re-inforcement learning algo...
Fitted Q-iteration (FQI) stands out among reinforcement learning algorithms for its flexibility and ...
Recent decision-theoric planning algorithms are able to find optimal solutions in large problems, us...
Using the expression for the unnormalized nonlinear filter for a hidden Markov model, we develop a d...
The Markov decision process (MDP) formulation used to model many real-world sequential decision maki...
The Markov decision process (MDP) formulation used to model many real-world sequential decision maki...
Reinforcement Learning (RL) in either fully or partially observable domains usually poses a requirem...
Recent decision-theoric planning algorithms are able to find optimal solutions in large problems, us...
We consider the problem of minimizing the long term average expected regret of an agent in an online...
<p>This dissertation describes sequential decision making problems in non-stationary environments. O...
Abstract The problem of reinforcement learning in a non-Markov environment isexplored using a dynami...
Reinforcement learning is a general computational framework for learning sequential decision strate...