We define TTD-MDPs, a novel class of Markov decision processes where the traditional goal of an agent is changed from finding an optimal trajectory through a state space to realizing a specified distribution of trajectories through the space. After motivating this formulation, we show how to convert a traditional MDP into a TTD-MDP. We derive an algorithm for finding non-deterministic policies by constructing a trajectory tree that allows us to compute locally-consistent policies. We specify the necessary conditions for solving the problem exactly and present a heuristic algorithm for constructing policies when an exact answer is impossible or impractical. We present empirical results for our algorithm in two domains: a synthetic grid world...
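As a rough illustration of what a locally consistent policy over a trajectory tree could look like, the sketch below handles only the simplest case. It assumes deterministic actions, so that a trajectory is fully identified by its action sequence, and a target distribution given explicitly over complete trajectories. The function name `local_policy` and the toy target are hypothetical, and this is not the algorithm from the paper, which also addresses stochastic transitions and cases where no exact solution exists.

```python
from collections import defaultdict


def local_policy(target):
    """Map each trajectory prefix to a distribution over next actions.

    `target` maps complete trajectories (tuples of actions) to probabilities.
    Assumption: actions are deterministic, so a prefix of actions identifies
    a unique node in the trajectory tree.
    """
    # Total target probability of all complete trajectories below each prefix.
    mass = defaultdict(float)
    for traj, p in target.items():
        for i in range(len(traj) + 1):
            mass[traj[:i]] += p

    # At each node, choose an action in proportion to the mass of the child
    # subtree it leads to, relative to the mass of the node itself.
    policy = defaultdict(dict)
    for prefix, m in mass.items():
        if prefix and m > 0:
            parent = prefix[:-1]
            policy[parent][prefix[-1]] = m / mass[parent]
    return dict(policy)


# Hypothetical target distribution over three two-step trajectories.
target = {("L", "L"): 0.5, ("L", "R"): 0.25, ("R", "L"): 0.25}
policy = local_policy(target)
```

Under these assumptions, multiplying the action probabilities along any root-to-leaf path recovers that trajectory's target probability, which is the sense in which the per-node choices are locally consistent with the global distribution.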
We propose a new approach to the problem of searching a space of stochastic controllers for a Markov...
Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision...
Markov Decision Processes (MDP) are a mathematical formalism of many domains of artificial intelligen...
Markov decision processes (MDP) [1] provide a mathematical framework for studying a wide range of o...
We describe an extension of the Markov decision process model in which a continuous time dimension i...
Markov Decision Processes (MDPs) have been extensively studied and used in the context of planning a...
We investigate the classical active pure exploration problem in Markov Decisio...
Time-average Markov decision problems are considered for the finite state and action spaces. Several...
This paper presents a new problem solving approach that is able to generate optimal policy solution ...
We propose a new method for learning policies for large, partially observable Markov decision proces...