Abstract. The following optimality principle is established for finite undiscounted or discounted Markov decision processes: if a policy is (gain, bias, or discounted) optimal in one state, it is also optimal for all states reachable from this state using this policy. The optimality principle is used constructively to demonstrate the existence of a policy that is optimal in every state, and then to derive the coupled functional equations satisfied by the optimal return vectors. This reverses the usual sequence, where one first establishes (via policy iteration or linear programming) the solvability of the coupled functional equations, and then shows that the solution is indeed the optimal return vector and that the maximizing policy for the functional equations is optimal in every state.
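For concreteness, the following is a sketch of the functional equations referred to above, written in standard notation that is assumed here rather than taken from the paper: a finite state space S, action sets A(i), rewards r(i,a), transition probabilities p(j | i,a), discount factor \beta \in [0,1), optimal discounted return v^*, optimal gain g^*, and bias h^*. For the discounted criterion the optimal return vector satisfies

\[
  v^*(i) \;=\; \max_{a \in A(i)} \Big[\, r(i,a) \;+\; \beta \sum_{j \in S} p(j \mid i,a)\, v^*(j) \Big], \qquad i \in S,
\]

while for the undiscounted (average-reward) criterion the optimal gain and bias satisfy the coupled pair

\[
  g^*(i) \;=\; \max_{a \in A(i)} \sum_{j \in S} p(j \mid i,a)\, g^*(j), \qquad
  g^*(i) + h^*(i) \;=\; \max_{a \in B(i)} \Big[\, r(i,a) \;+\; \sum_{j \in S} p(j \mid i,a)\, h^*(j) \Big],
\]

where B(i) \subseteq A(i) denotes the set of actions attaining the maximum in the first equation. These are the classical multichain optimality equations; the paper's contribution is to derive them from the optimality principle stated above rather than to presuppose their solvability.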