We consider a Markov decision process under both the expected limiting average and the discounted total return criteria, each appropriately modified to include a penalty for the variability in the stream of rewards. In both cases we formulate nonlinear programs in the space of state-action frequencies (averaged or discounted) whose optimal solutions are shown to be related to the optimal policies in the corresponding “variance-penalized MDP.” The analysis of one of the discounted cases is facilitated by the introduction of a “Cartesian product of two independent MDPs.”
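The limiting-average case can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: the MDP data below are hypothetical, every stationary policy is assumed to induce a single aperiodic recurrent class, and variability is taken to be the long-run average squared deviation of the one-step reward from its mean. The function names (`frequencies`, `penalized_value`, `best_deterministic_policy`) are illustrative only. Instead of solving the nonlinear program, the sketch simply enumerates deterministic stationary policies and evaluates the penalized objective at each policy's state-action frequencies.

```python
from itertools import product

# Hypothetical 2-state, 2-action MDP (all numbers are illustrative).
# P[s][a] is the next-state distribution, R[s][a] the one-step reward.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
R = [
    [1.0, 0.0],
    [2.0, 5.0],
]


def stationary_distribution(T, iters=1000):
    """Approximate the stationary distribution of T by power iteration.

    Assumes T is the transition matrix of an aperiodic chain with a
    single recurrent class, so the iteration converges.
    """
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[s] * T[s][t] for s in range(n)) for t in range(n)]
    return pi


def frequencies(policy):
    """Long-run state-action frequencies x(s, a) of a deterministic policy."""
    T = [P[s][policy[s]] for s in range(len(P))]
    pi = stationary_distribution(T)
    return {(s, policy[s]): pi[s] for s in range(len(P))}


def penalized_value(x, lam):
    """Return (mean - lam * variability, mean, variability) for frequencies x."""
    mean = sum(R[s][a] * w for (s, a), w in x.items())
    variability = sum((R[s][a] - mean) ** 2 * w for (s, a), w in x.items())
    return mean - lam * variability, mean, variability


def best_deterministic_policy(lam):
    """Brute-force search over deterministic stationary policies."""
    policies = product(range(2), repeat=len(P))
    return max(policies, key=lambda pol: penalized_value(frequencies(pol), lam)[0])


if __name__ == "__main__":
    for lam in (0.0, 5.0):
        pol = best_deterministic_policy(lam)
        print(lam, pol, penalized_value(frequencies(pol), lam))
```

In this toy instance the unpenalized problem (penalty weight 0) prefers the high-reward, high-variability policy, while a penalty weight of 5 shifts the choice to a policy with a lower but steadier reward stream.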
In this paper we address a basic problem that arises naturally in average-reward Markov decision pro...
The article is devoted to second order optimality in Markov decision processes. Attention is primari...
We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of...
Considering Markovian Decision Processes (MDPs), the meaning of an optimal pol...
This paper considers the question of under what circumstances average expected reward optima...
In control systems theory, the Markov decision process (MDP) is a widely used optimization model inv...
Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with unce...
We develop the asymptotic variance for Markov decision processes. Results are provided to express th...
In many sequential decision-making problems we may want to manage risk by minimizing some measure of...
In robust Markov decision processes (MDPs), the uncertainty in the transition kernel is addressed by...
In this paper we consider discounted Markov decision processes with finite state space and compact a...
This paper deals with a first passage mean-variance problem for semi-Markov decision process...
We consider the optimization of the variance of the sum of costs as well as that of an avera...
Markov decision processes (MDPs) and their variants are widely studied in the theory of controls for...
Time-average Markov decision problems are considered for the finite state and action spaces. Several...