This dissertation applies policy improvement and successive approximation (value iteration) to a general class of Markov decision processes with discounted costs. In particular, it studies a class of Markov decision processes called piecewise-linear. Piecewise-linear processes are characterized by the property that the value function of a process observed for one period and then terminated is piecewise-linear whenever the terminal reward function is piecewise-linear. Partially observable Markov decision processes have this property. It is shown that there exist ε-optimal piecewise-linear value functions and piecewise-constant policies that are simple, where simple means that there are only finitely many pieces, each of which is defined on a convex ...
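As a rough illustration of the successive approximation (value iteration) named in the abstract, the sketch below runs it on a small, fully observed, finite-state MDP with discounted costs. It is not the dissertation's piecewise-linear algorithm: the function name `value_iteration`, the arrays `P` and `c`, the discount factor `beta`, and the two-state example are all assumptions introduced here for illustration.

```python
def value_iteration(P, c, beta, tol=1e-10, max_iter=10_000):
    """Successive approximation for a finite discounted-cost MDP (illustrative sketch).

    P[a][s][t] : probability of moving from state s to state t under action a
    c[a][s]    : one-period cost of taking action a in state s
    beta       : discount factor, assumed to satisfy 0 <= beta < 1
    """
    n_actions = len(P)
    n_states = len(P[0])
    v = [0.0] * n_states  # start from the zero terminal cost

    def q_value(a, s, values):
        # Cost now plus discounted expected cost-to-go under `values`.
        return c[a][s] + beta * sum(P[a][s][t] * values[t] for t in range(n_states))

    for _ in range(max_iter):
        # Apply the one-period optimality operator: minimise over actions.
        v_new = [min(q_value(a, s, v) for a in range(n_actions))
                 for s in range(n_states)]
        gap = max(abs(x - y) for x, y in zip(v_new, v))
        v = v_new
        if gap < tol:
            break

    # Greedy (cost-minimising) stationary policy with respect to the final v.
    policy = [min(range(n_actions), key=lambda a: q_value(a, s, v))
              for s in range(n_states)]
    return v, policy


# Hypothetical two-state example: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
c = [
    [1.0, 0.0],  # staying costs 1 in state 0 and nothing in state 1
    [0.5, 0.5],  # switching costs 0.5 from either state
]
values, policy = value_iteration(P, c, beta=0.9)
```

In this toy example the iteration settles on switching out of state 0 and staying in state 1. For a partially observable process the state would instead be a belief vector, and each iterate would be stored as a finite collection of linear pieces rather than a table of values.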
For a vector-valued Markov decision process, we characterize optimal (deterministic) station...
This paper considers Markov decision processes (MDPs) that have the discounted...
This paper considers how partially observable Markov decision processes may be transformed i...
This letter investigates the structure of the optimal policy for a class of Markov decision processe...
For semi-Markov decision processes with discounted rewards we derive the well known results regardin...
For a vector-valued Markov decision process with discounted reward criterion, we introduce a...
The thesis develops methods to solve discrete-time finite-state partially observable Markov decision...