Abstract. Bounded parameter Markov Decision Processes (BMDPs) address uncertainty in the parameters of a Markov Decision Process (MDP). Unlike the case of an MDP, the notion of an optimal policy for a BMDP is not entirely straightforward. We consider two notions of optimality based on optimistic and pessimistic criteria. These have been analyzed for discounted BMDPs. Here we provide results for average reward BMDPs. We establish a fundamental relationship between the discounted and the average reward problems, prove the existence of Blackwell optimal policies and, for both notions of optimality, derive algorithms that converge to the optimal value function.
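To make the optimistic and pessimistic criteria concrete, the following is a minimal sketch of interval value iteration for the discounted case only; it is not the average reward algorithm derived in this paper. It assumes a finite BMDP whose transition probabilities and rewards are given as closed intervals per state-action pair. The names `extreme_distribution` and `interval_value_iteration`, and the nested-list data layout, are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def extreme_distribution(bounds: List[Interval], values: List[float],
                         optimistic: bool) -> List[float]:
    """Return a distribution inside the interval bounds that maximizes
    (optimistic) or minimizes (pessimistic) the expected value."""
    # Start every successor at its lower bound, then hand out the
    # remaining probability mass greedily in order of preference.
    probs = [lo for lo, _ in bounds]
    slack = 1.0 - sum(probs)
    order = sorted(range(len(values)), key=lambda s: values[s],
                   reverse=optimistic)
    for s in order:
        bump = min(bounds[s][1] - probs[s], slack)
        probs[s] += bump
        slack -= bump
    return probs


def interval_value_iteration(trans: List[List[List[Interval]]],
                             rewards: List[List[Interval]],
                             gamma: float, optimistic: bool,
                             iters: int = 500) -> List[float]:
    """Discounted value iteration under the optimistic or pessimistic
    criterion. trans[s][a][t] bounds P(t | s, a); rewards[s][a] bounds
    the reward (an assumed layout, not taken from the paper)."""
    n = len(trans)
    v = [0.0] * n
    pick = 1 if optimistic else 0  # upper reward bound if optimistic
    for _ in range(iters):
        new_v = []
        for s in range(n):
            candidates = []
            for a in range(len(trans[s])):
                p = extreme_distribution(trans[s][a], v, optimistic)
                expected = sum(pi * vi for pi, vi in zip(p, v))
                candidates.append(rewards[s][a][pick] + gamma * expected)
            new_v.append(max(candidates))
        v = new_v
    return v
```

Calling the function once with `optimistic=True` and once with `optimistic=False` gives an upper and a lower value estimate per state under the stated assumptions. The greedy mass assignment relies on each state-action pair having its own independent interval constraints.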
What are the functionals of the reward that can be computed and optimized exactly in Markov Decision...
We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of...
In this paper we address the following basic feasibility problem for infinite-horizon Markov decisio...
In this paper, we introduce the notion of a bounded-parameter Markov decision process (BMDP)...
This paper considers the question of under what circumstances average expected reward optima...
In robust Markov decision processes (MDPs), the uncertainty in the transition kernel is addressed by...
In the present paper the expected average reward criterion is considered instead of the aver...
A Markov decision process (MDP) relies on the notions of state, describing the current situation of ...
We consider multistage decision processes where the criterion function is an expectation of minimum func...
This paper deals with the average expected reward criterion for continuous-time Markov decis...
Considering Markovian Decision Processes (MDPs), the meaning of an optimal pol...
The precise specification of reward functions for Markov decision processes (MDPs) is often extremel...
The running time of the classical algorithms of the Markov Decision Process (MDP) typically grows li...