Markov decision processes (MDPs) may involve three types of delays. First, state information, rather than being available instantaneously, may arrive with a delay (observation delay). Second, an action may take effect at a later decision stage rather than immediately (action delay). Third, the cost induced by an action may be collected after a number of stages (cost delay). We derive two results, one for constant and one for random delays, for reducing an MDP with delays to an MDP without delays, which differs only in the size of the state space. The results are based on the intuition that costs may be collected asynchronously, i.e., at a stage other than the one in which they are induced, as long as they are discounted properly.
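As a purely illustrative sketch of the reduction idea (not the paper's construction), the Python code below assumes a finite MDP given as dictionaries and a constant observation delay d. The augmented state of the delay-free MDP is taken to be the last observed state together with the d actions chosen since that observation; the helper names (augment, current_state_belief, expected_stage_cost) and the toy numbers are hypothetical.

# Illustrative sketch only (assumed interface, not the paper's code): for a
# constant observation delay d, the delay-free MDP's state is the last
# observed state together with the d actions chosen since that observation.
from itertools import product

def augment(states, actions, d):
    """Augmented states (s_obs, (a_1, ..., a_d)) for observation delay d."""
    return [(s, acts) for s in states for acts in product(actions, repeat=d)]

def current_state_belief(P, s_obs, pending):
    """Distribution over the unobserved current state, obtained by pushing
    the last observed state through the pending actions."""
    belief = {s_obs: 1.0}
    for a in pending:
        nxt = {}
        for s, p in belief.items():
            for s2, q in P[s][a].items():
                nxt[s2] = nxt.get(s2, 0.0) + p * q
        belief = nxt
    return belief

def expected_stage_cost(P, c, s_obs, pending, a):
    """Expected cost of choosing action a in augmented state (s_obs, pending);
    the cost is actually incurred at the true (unobserved) current state."""
    return sum(p * c[s][a]
               for s, p in current_state_belief(P, s_obs, pending).items())

# Tiny hypothetical example: two states, two actions, delay d = 2.
P = {0: {"go": {0: 0.2, 1: 0.8}, "stay": {0: 1.0}},
     1: {"go": {0: 0.5, 1: 0.5}, "stay": {1: 1.0}}}
c = {0: {"go": 1.0, "stay": 0.0}, 1: {"go": 0.5, "stay": 2.0}}
print(len(augment([0, 1], ["go", "stay"], d=2)))           # 2 * 2**2 = 8 augmented states
print(expected_stage_cost(P, c, 0, ("go", "go"), "stay"))  # 1.12

A value iteration run on this augmented model would then discount each expected cost at the stage where it is collected rather than the stage where it is induced, which is the asynchronous cost-collection intuition the abstract refers to.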