We consider a finite state/action Markov decision process over an infinite time horizon under the limiting average reward criterion. We are interested not only in maximizing this reward criterion but also in minimizing the "variability" of the stream of rewards. The latter notion is formalized in two alternative ways: one measures absolute deviations from the "optimal" reward, and the other uses the "long-run variance" of a policy. In both cases we formulate a bi-objective optimization problem and show that efficient (i.e., "nondominated") deterministic stationary policies exist and can be computed by finite algorithms. In addition, in the former case we give an algorithm for computing a finite set of "c...
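The bi-objective idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm: it simply enumerates every deterministic stationary policy of a small, hypothetical two-state MDP and keeps the nondominated (mean, variance) pairs. Two assumptions are made here that the abstract does not confirm: each policy induces an ergodic chain, so a unique stationary distribution exists, and "long-run variance" is taken to be the stationary variance of the one-step reward. The names `TRANSITIONS`, `REWARDS`, `mean_and_variance`, and `efficient_policies` are illustrative only.

```python
from itertools import product


def stationary_distribution(P, iters=10_000, tol=1e-12):
    """Power iteration for x = xP. Assumes an ergodic (irreducible, aperiodic) chain."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x


def mean_and_variance(P, r):
    """Long-run average reward and (assumed) long-run variance of one-step rewards."""
    x = stationary_distribution(P)
    mean = sum(xi * ri for xi, ri in zip(x, r))
    var = sum(xi * (ri - mean) ** 2 for xi, ri in zip(x, r))
    return mean, var


# Hypothetical toy MDP: TRANSITIONS[s][a] is the row of next-state
# probabilities and REWARDS[s][a] the one-step reward for action a in state s.
TRANSITIONS = [
    [[0.5, 0.5], [0.9, 0.1]],
    [[0.5, 0.5], [0.5, 0.5]],
]
REWARDS = [
    [0.0, 1.0],
    [4.0, 1.0],
]


def efficient_policies(transitions, rewards):
    """Return {policy: (mean, variance)} for the nondominated deterministic policies."""
    n = len(transitions)
    scores = {}
    for policy in product(*(range(len(transitions[s])) for s in range(n))):
        P = [transitions[s][policy[s]] for s in range(n)]
        r = [rewards[s][policy[s]] for s in range(n)]
        scores[policy] = mean_and_variance(P, r)

    def dominates(a, b):
        # a is at least as good on both criteria and differs from b on at least one
        return a[0] >= b[0] and a[1] <= b[1] and a != b

    return {p: mv for p, mv in scores.items()
            if not any(dominates(other, mv) for other in scores.values())}


if __name__ == "__main__":
    for policy, (mean, var) in sorted(efficient_policies(TRANSITIONS, REWARDS).items()):
        print(policy, round(mean, 6), round(var, 6))
```

Exhaustive enumeration grows exponentially with the number of states, so it is only practical for toy instances like this one; it is used here purely to make the notion of an efficient policy concrete. In this example three of the four deterministic policies are nondominated, which shows that a single "best" policy need not exist once variability is taken into account.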
We consider finite horizon Markov decision processes under performance measures that involve both th...
This paper studies the minimizing risk problems in Markov decision processes with countable ...
The article is devoted to Markov reward chains in a discrete-time setting with finite state sp...
We study the optimization of average rewards of discrete time nonhomogeneous Markov chains, in which...
This paper deals with a discrete time Markov decision model with a finite state space, arbit...
We study controller synthesis problems for finite-state Markov decision processes, where the objecti...
We consider the batch (off-line) policy learning problem in the infinite horizon Markov Decision Pro...
In this note attention is focused on finding policies optimizing risk-sensitive optimality c...
We consider a discrete time, finite state Markov reward process that depends on a set of pa...