Abstract. Markov Decision Processes (MDP) are a widely used model including both non-deterministic and probabilistic choices. Minimal and maximal probabilities to reach a target set of states, with respect to a policy resolving non-determinism, may be computed by several methods including value iteration. This algorithm, easy to implement and efficient in terms of space complexity, consists of iteratively computing the probabilities of paths of increasing length. However, it raises three issues: (1) defining a stopping criterion ensuring a bound on the approximation, (2) analyzing the rate of convergence, and (3) specifying an additional procedure to obtain the exact values once a sufficient number of iterations has been performed. The firs...
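The value-iteration scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the toy MDP, the function name, and the tolerance are all invented for the example. Note that the naive stopping criterion used here is precisely what issue (1) above warns about — a small change between successive iterates does not by itself bound the distance to the exact reachability probabilities.

```python
def max_reach_value_iteration(states, actions, P, target, epsilon=1e-8):
    """Approximate maximal probabilities of reaching `target`.

    P[(s, a)] is a dict {successor: probability}; actions[s] lists the
    actions enabled in non-target state s.  (Illustrative sketch.)
    """
    # Iterate x_{n+1}(s) = max_a sum_{t} P(s, a, t) * x_n(t),
    # starting from the indicator of the target set.
    x = {s: (1.0 if s in target else 0.0) for s in states}
    while True:
        y = {}
        for s in states:
            if s in target:
                y[s] = 1.0
            else:
                y[s] = max(
                    sum(p * x[t] for t, p in P[(s, a)].items())
                    for a in actions[s]
                )
        # Naive stopping criterion: successive iterates are close.
        # This does NOT guarantee closeness to the true fixed point.
        if max(abs(y[s] - x[s]) for s in states) < epsilon:
            return y
        x = y

# Toy MDP: from s0, "safe" reaches goal with probability 0.5,
# while "retry" reaches goal with 0.3 and loops back otherwise.
states = {"s0", "goal", "sink"}
actions = {"s0": ["safe", "retry"], "sink": ["stay"]}
P = {
    ("s0", "safe"): {"goal": 0.5, "sink": 0.5},
    ("s0", "retry"): {"goal": 0.3, "s0": 0.7},
    ("sink", "stay"): {"sink": 1.0},
}
probs = max_reach_value_iteration(states, actions, P, {"goal"})
```

Here the optimal policy retries forever, so the exact maximal probability from s0 is 1; the iterates approach it only geometrically (with rate 0.7), which illustrates issue (2), the rate of convergence.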
Partially observable Markov decision processes (POMDPs) have recently become popular among many AI r...
Solving Markov Decision Processes is a recurrent task in engineering which can be performed efficien...
Markov decision processes (MDP) [1] provide a mathematical framework for studying a wide range of o...
Value iteration is a fundamental algorithm for solving Markov Decision Processes (MDPs). It computes...
This research focuses on Markov Decision Processes (MDP). MDP is one of the most important and chall...
We consider the problem of approximating the reachability probabilities in Markov decision processes...
Value iteration is a commonly used and empirically competitive method in solving many Markov decisi...