We study the problem of long-run average cost control of Markov chains conditioned on a rare event. A related recent work developed a simulation-based algorithm for estimating performance measures of a Markov chain conditioned on a rare event. We extend the ideas from that work and develop an adaptive algorithm that obtains, online, optimal control policies conditioned on a rare event. Our algorithm uses three timescales, that is, three step-size schedules. On the slowest timescale, policy updates are made by a gradient search algorithm based on one-simulation simultaneous perturbation stochastic approximation (SPSA) type estimates. Deterministic perturbation sequences obtained from appropriate normalized Hadamard matrices...
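To make the slowest-timescale idea concrete, the following is a minimal sketch of a one-simulation SPSA gradient estimate driven by deterministic perturbations taken from a normalized Hadamard matrix. It is not the authors' three-timescale algorithm: the Sylvester construction, the function names (`sylvester_hadamard`, `hadamard_perturbations`, `one_sim_gradient`), and the noise-free cost passed in as `cost` are all assumptions made for illustration. The sketch relies on the fact that, once the all-ones first column is dropped, the remaining columns each sum to zero and are mutually orthogonal over one full cycle of rows, so the bias terms of the one-simulation estimate cancel when averaged over that cycle.

```python
def sylvester_hadamard(order):
    """Return the order x order Sylvester Hadamard matrix (order a power of 2).

    The Sylvester construction is normalized: its first row and first
    column consist entirely of +1 entries.
    """
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def hadamard_perturbations(dim):
    """Build a cycle of +/-1 perturbation vectors of length `dim`.

    Assumption: the order is the smallest power of two >= dim + 1, and the
    all-ones first column is dropped so every remaining column sums to zero.
    """
    order = 1
    while order < dim + 1:
        order *= 2
    return [row[1:dim + 1] for row in sylvester_hadamard(order)]


def one_sim_gradient(cost, theta, delta, perturbations):
    """Average the one-simulation SPSA estimate over one perturbation cycle.

    Each perturbation uses a single cost evaluation at theta + delta * pert;
    component i of the estimate is cost / (delta * pert[i]).
    """
    dim = len(theta)
    grad = [0.0] * dim
    for pert in perturbations:
        shifted = [t + delta * p for t, p in zip(theta, pert)]
        value = cost(shifted)  # one evaluation per perturbation
        for i in range(dim):
            grad[i] += value / (delta * pert[i])
    return [g / len(perturbations) for g in grad]


# Hypothetical stand-in for a simulated cost: a noise-free linear function
# with gradient (2, -5, 0.5).
def linear_cost(x):
    return 3.0 + 2.0 * x[0] - 5.0 * x[1] + 0.5 * x[2]


perts = hadamard_perturbations(3)
estimate = one_sim_gradient(linear_cost, [1.0, -2.0, 0.5], 0.1, perts)
```

For a linear cost the cycle average recovers the gradient up to floating-point error; for a nonlinear or noisy cost it is only an approximation whose bias depends on `delta`, which is why such estimates are usually embedded in a stochastic approximation recursion with diminishing step sizes.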
A control problem for a partially observable Markov chain depending on a parameter with long run ave...
Markov decision process (MDP) models are widely used for modeling sequential decision-making problem...
We present a general framework for applying simulation to optimize the behavior of discrete event sy...
We consider the problem of simulation-based estimation of performance measures for a Markov chain co...
A partially observed stochastic system is described by a discrete time pair of Markov processes. The...
100 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1985. In this thesis we consider th...
The ergodic or long-run average cost control problem for a partially observed finite-state Markov ch...
In this dissertation we study stochastic control problems for systems modelled by discrete-time...
In Chapter 2, we propose several two-timescale simulation-based actor-critic algorithms for solution...
Abstract. Three distinct controlled ergodic Markov models are considered here. The models are a disc...
The problem of controlling a Markov chain on a countable state space with ergodic or `long run avera...
The classical optimal control problems for discrete-time, transient Markov processes are infinite ho...
This is the published version, also available here: http://dx.doi.org/10.1137/S0363012996298369. Thre...