Problems involving optimal sequential decision making in uncertain dynamic systems arise in domains such as engineering, science and economics. Such problems can often be cast in the framework of a Markov decision process (MDP). Solving an MDP requires computing the optimal value function and the optimal policy. The ideas of dynamic programming (DP) and the Bellman equation (BE) are at the heart of solution methods. The three principal exact DP methods are value iteration, policy iteration and linear programming, all of which compute the optimal value function and the optimal policy. However, exact DP methods are often inadequate in practice because the state space is typically large, and one might have to resort to approximate methods th...
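To make the Bellman-equation view and the exact value iteration method named above concrete, here is a minimal tabular sketch; the array layout (P indexed as [action, state, next_state], R as [action, state]), the tolerance, and the function name are illustrative assumptions, not drawn from the abstract itself.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Exact value iteration for a small tabular MDP (illustrative sketch).

    P: transition probabilities, shape (A, S, S); P[a, s, s'] = Pr(s' | s, a)
    R: expected rewards, shape (A, S); R[a, s] = E[r | s, a]
    Returns the optimal value function V* and a greedy policy.
    """
    num_actions, num_states, _ = P.shape
    V = np.zeros(num_states)
    while True:
        # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * sum_s' P(s'|s,a) V(s')
        Q = R + gamma * P @ V            # shape (A, S)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    policy = Q.argmax(axis=0)            # greedy policy with respect to V*
    return V, policy
```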
We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogen...
We consider approximate dynamic programming for the infinite-horizon stationary γ-discounted optimal...
This brief studies the stochastic optimal control problem via reinforcement learning and approximate...
Sequential decision making under uncertainty is at the heart of a wide variety of practical problems...
Q-Learning is based on value iteration and remains the most popular choice for solving Marko...
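Since the abstract above characterizes Q-Learning as being based on value iteration, a minimal tabular Q-Learning loop may help make the connection visible: each update is a sampled, incremental form of the Bellman optimality backup. The (P, R) layout follows the earlier value iteration sketch, and the hyperparameters and exploration scheme are illustrative assumptions.

```python
import numpy as np

def q_learning(P, R, gamma=0.95, alpha=0.1, epsilon=0.1,
               episodes=500, horizon=100, seed=0):
    """Tabular Q-Learning on a small MDP given as (P, R) arrays (illustrative sketch).

    Each update applies a sampled form of the value-iteration backup:
        Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    rng = np.random.default_rng(seed)
    num_actions, num_states, _ = P.shape
    Q = np.zeros((num_actions, num_states))
    for _ in range(episodes):
        s = rng.integers(num_states)
        for _ in range(horizon):
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = int(rng.integers(num_actions))
            else:
                a = int(Q[:, s].argmax())
            s_next = rng.choice(num_states, p=P[a, s])   # sample next state
            r = R[a, s]
            # temporal-difference update toward the sampled Bellman target
            Q[a, s] += alpha * (r + gamma * Q[:, s_next].max() - Q[a, s])
            s = s_next
    return Q
```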
Markov decision processes (MDPs) with a large number of states are of high practical interest. However...
The framework of dynamic programming (DP) and reinforcement learning (RL) can be used to express imp...
Thesis (Ph.D.)--University of Washington, 2018. A broad range of optimization problems in applications...
Computing the exact solution of an MDP model is generally difficult and possibly intractable for rea...
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP...
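As a generic illustration of approximate value iteration (not the specific AVI algorithms studied in this paper), the sketch below alternates an exact Bellman optimality backup with a least-squares projection onto the span of a linear feature map; the feature matrix phi and all parameter choices are assumptions made for illustration.

```python
import numpy as np

def fitted_value_iteration(P, R, phi, gamma=0.95, iters=100):
    """Generic approximate value iteration sketch with linear features.

    phi: feature matrix, shape (S, d); row phi[s] is the feature vector of state s.
    Each iteration backs up the current approximate values exactly, then
    projects the result back onto the feature span via least squares.
    """
    w = np.zeros(phi.shape[1])
    for _ in range(iters):
        V = phi @ w                               # current approximate values
        target = (R + gamma * P @ V).max(axis=0)  # exact Bellman backup of V
        w, *_ = np.linalg.lstsq(phi, target, rcond=None)  # projection step
    return phi @ w
```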
We consider large-scale Markov decision processes (MDPs) with parameter uncertainty, under the robu...
2014-10-14. This dissertation addresses some problems in the area of learning, optimization and decisi...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
Solving Markov decision processes (MDPs) efficiently is challenging in many cases, for example, when...
Reinforcement learning algorithms hold promise in many complex domains, such as resource management ...