Abstract — In this paper, we present a recursive least squares approximate policy iteration (RLSAPI) algorithm for Markov decision processes with multi-dimensional continuous state and action spaces. Under certain structural assumptions on the value functions and policy spaces, the approximate policy iteration algorithm is provably convergent in the mean. That is, the mean absolute deviation of the approximate policy's value function from the optimal value function goes to zero as the successive approximations improve.
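The abstract above does not reproduce the RLSAPI algorithm itself, but the approximate policy iteration loop it builds on alternates least-squares policy evaluation with greedy improvement. A minimal sketch of that loop on a toy finite MDP (all transition probabilities, rewards, and the discount factor below are hypothetical illustration values, not from the paper) might look like:

```python
import numpy as np

# Toy 2-state, 2-action MDP (hypothetical numbers, for illustration only).
# P[a][s, s'] = transition probability; R[a][s] = expected immediate reward.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # action 0
     np.array([[0.5, 0.5], [0.6, 0.4]])]   # action 1
R = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
gamma = 0.9  # discount factor

def evaluate(policy):
    """Policy evaluation via a least-squares solve of (I - gamma*P_pi) v = r_pi."""
    n = len(policy)
    P_pi = np.array([P[policy[s]][s] for s in range(n)])
    r_pi = np.array([R[policy[s]][s] for s in range(n)])
    v, *_ = np.linalg.lstsq(np.eye(n) - gamma * P_pi, r_pi, rcond=None)
    return v

def improve(v):
    """Greedy policy improvement from the current value estimate."""
    q = np.stack([R[a] + gamma * P[a] @ v for a in range(2)])  # q[a, s]
    return q.argmax(axis=0)

policy = np.zeros(2, dtype=int)
for _ in range(20):                      # iterate until the policy is stable
    new_policy = improve(evaluate(policy))
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy
```

On a finite MDP this converges to an optimal policy in finitely many sweeps; the recursive, sample-based least-squares updates and continuous state/action spaces that distinguish RLSAPI are beyond this sketch.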
Abstract — Q-Learning is based on value iteration and remains the most popular choice for solving Marko...
Solving Markov Decision Processes is a recurrent task in engineering which can be performed efficien...
Abstract — We consider an approximation scheme for solving Markov decision processes (MDPs) with counta...
Most of the current theory for dynamic programming algorithms focuses on finite state, finite action...
We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding t...
In this paper we study a class of modified policy iteration algorithms for solving Markov decision p...
Simulation-based policy iteration (SBPI) is a modification of the policy iteration algorithm for com...
We study a new, model-free form of approximate policy iteration which uses Sarsa updates with linear...
This paper establishes the link between an adaptation of the policy iteration ...
We consider the problem of finding an optimal policy in a Markov decision process that maximises the...
In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solu...
Abstract — Approximate reinforcement learning deals with the essential problem of applying reinforceme...
We explore approximate policy iteration, replacing the usual cost-function learning step with a learn...
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another...
Markov decision processes (MDP) [1] provide a mathematical framework for studying a wide range of o...