We develop a simulation-based, two-timescale actor-critic algorithm for infinite horizon Markov decision processes with finite state and action spaces, under a discounted reward criterion. The algorithm is of the gradient ascent type and performs a search in the space of stationary randomized policies. It uses simultaneous deterministic perturbation stochastic approximation (SDPSA) gradient estimates for enhanced performance. We show an application of our algorithm to a problem of mortgage refinancing; the algorithm obtains the optimal refinancing strategies in a computationally efficient manner.
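The SDPSA gradient estimate mentioned in the abstract can be illustrated in isolation. The sketch below performs gradient ascent using two-sided finite differences along deterministic ±1 perturbation directions cycled from the rows of a 2x2 Hadamard matrix, so that perturbation cross-terms cancel over each cycle. This is a minimal illustration of the estimator on a simple smooth objective, not the paper's two-timescale actor-critic; the objective, constant step sizes, and function names are illustrative assumptions.

```python
import numpy as np

def sdpsa_maximize(f, theta0, n_iters=2000, delta=0.1, a=0.05):
    """Gradient ascent on f using two-sided SDPSA gradient estimates.

    Deterministic +/-1 perturbations cycle through the rows of a 2x2
    Hadamard matrix; over each two-step cycle the cross-term bias in
    the finite-difference estimate cancels.
    """
    theta = np.asarray(theta0, dtype=float)
    perturbations = np.array([[1.0, 1.0], [1.0, -1.0]])  # Hadamard rows
    for k in range(n_iters):
        d = perturbations[k % 2]
        # Two-sided SPSA-style estimate with a deterministic direction d:
        # g_i ~= (f(theta + delta*d) - f(theta - delta*d)) / (2*delta*d_i)
        g = (f(theta + delta * d) - f(theta - delta * d)) / (2.0 * delta * d)
        theta = theta + a * g  # ascent step on the estimated gradient
    return theta

# Toy smooth objective with a unique maximum at (1, -2).
f = lambda th: -(th[0] - 1.0) ** 2 - (th[1] + 2.0) ** 2
theta_star = sdpsa_maximize(f, [0.0, 0.0])
```

In the full algorithm a diminishing step-size sequence and a faster critic timescale would replace the constant steps used here; the point of the sketch is only that two function evaluations per iteration, along deterministically cycled directions, suffice to estimate the full gradient.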
We develop extensions of the Simulated Annealing with Multiplicative Weights (SAMW) algorithm that p...
In many sequential decision-making problems we may want to manage risk by minimizing some measure of...
Problems of sequential decision making under uncertainty are common in manufacturing, computer and co...
We develop a simulation-based, two-timescale actor-critic algorithm for infinite horizon Markov deci...
A two-timescale simulation-based actor-critic algorithm for solution of infinite horizon Markov deci...
In Chapter 2, we propose several two-timescale simulation-based actor-critic algorithms for solution...
This article proposes several two-timescale simulation-based actor-critic algorithms for solution of...
Due to their non-stationarity, finite-horizon Markov decision processes (FH-MDPs) have one probabili...
We consider the problem of control of hierarchical Markov decision processes and develop a simulatio...
We develop in this article the first actor-critic reinforcement learning algorithm with function app...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
We propose and analyze a class of actor-critic algorithms for simulation-based optimization of a Mar...
We develop two new online actor-critic control algorithms with adaptive feature tuning for Markov De...
Problems involving optimal sequential decision making in uncertain dynamic systems arise in domains such as e...
Abstract Actor-critic algorithms are amongst the most well-studied reinforcement learning algorithms...