This paper studies approximate policy iteration (API) for solving undiscounted optimal control problems. We consider a discrete-time system with a continuous state space and a finite action set. Because an approximation technique is used for the continuous state space, approximation errors arise in the computation and disturb the convergence of the original policy iteration. We analyze and prove the convergence of API for undiscounted optimal control. We use an iterative method to implement approximate policy evaluation and show that the error between the approximate and exact value functions is bounded. Then, because the action set is finite, the greedy policy in the policy improvement step is generated directly. Our main theorem pro...
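The two steps named in the abstract, iterative policy evaluation and greedy improvement over a finite action set, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy dynamics x' = clip(x + u), the stage cost x^2, the five-element action set, and a uniform grid with linear interpolation as the value-function approximator. The initial policy is chosen to be admissible (it reaches the zero-cost origin), which is what keeps the undiscounted sum finite in this toy setting.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): scalar state on [-1, 1].
GRID = np.arange(-10, 11) / 10.0            # interpolation nodes, 0.0 is a node
ACTIONS = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])  # finite action set


def step(x, u):
    """Assumed dynamics: move by u and stay inside the grid range."""
    return np.clip(x + u, GRID[0], GRID[-1])


def cost(x, u):
    """Assumed stage cost: zero only at the origin, independent of u."""
    return x ** 2 + 0.0 * u


def interp_value(v, x):
    """Approximate value at arbitrary states by linear interpolation."""
    x = np.asarray(x, dtype=float)
    return np.interp(x.ravel(), GRID, v).reshape(x.shape)


def evaluate_policy(policy, max_sweeps=500, tol=1e-10):
    """Iterative approximate evaluation: V <- c(x, pi(x)) + V(f(x, pi(x)))."""
    u = ACTIONS[policy]
    x_next = step(GRID, u)
    c = cost(GRID, u)
    v = np.zeros_like(GRID)
    for _ in range(max_sweeps):
        v_new = c + interp_value(v, x_next)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


def improve_policy(v):
    """Greedy improvement: enumerate the finite action set at every node."""
    x = GRID[:, None]
    u = ACTIONS[None, :]
    q = cost(x, u) + interp_value(v, step(x, u))
    return np.argmin(q, axis=1)


def approximate_policy_iteration(policy, max_iters=20):
    """Alternate evaluation and improvement until the policy stops changing."""
    v = evaluate_policy(policy)
    for _ in range(max_iters):
        new_policy = improve_policy(v)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
        v = evaluate_policy(policy)
    return policy, v


# Admissible but suboptimal start: always move 0.1 toward the origin.
initial_policy = np.where(GRID > 0, 1, np.where(GRID < 0, 3, 2))
policy, v = approximate_policy_iteration(initial_policy)
```

In this toy setup the initial policy accumulates a cost of 3.85 from x = 1, while the returned policy moves in steps of 0.2 and accumulates 2.20. Because the grid nodes map onto grid nodes, interpolation error is negligible here; with a coarser grid or different dynamics, the evaluation step would carry the kind of approximation error that the abstract says must be bounded.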
ISSN 0819-2642 ISBN 0 7340 2618 8 Research Paper Number 961. This paper studies fitted value iteration...
We consider the infinite-horizon discounted optimal control problem formalized by Markov Decision ...
We consider the infinite-horizon discounted optimal control problem formalized by Markov Decision ...
We consider the discrete-time infinite-horizon optimal control problem formalized by Markov decisio...
We present a new algorithm called policy iteration plus (PI +) for the optimal...
We study a new, model-free form of approximate policy iteration which uses Sarsa updates with linear...
Convergence of the policy iteration method for discrete and continuous optimal control problems hold...
We consider approximate dynamic programming for the infinite-horizon stationary γ-discounted optimal...
We present a numerical method for generating the state-feedback control policy associated with gener...
This paper studies fitted value iteration for continuous state dynamic programming using nonexpansiv...
We consider the problem of learning discounted-cost optimal control policies for unknown determinist...
Most of the current theory for dynamic programming algorithms focuses on finite state, finite action...
This paper studies fitted value iteration for continuous state numerical dynamic programming using n...
We consider the discrete-time infinite-horizon optimal control problem formali...
We consider the infinite-horizon γ-discounted optimal control problem formalized by Markov Decision ...