A Markov decision process (MDP) relies on the notions of state, describing the current situation of the agent; action, affecting the dynamics of the process; and reward, observed for each transition between states. This chapter presents the basics of MDP theory and optimization for an agent that has perfect knowledge of the decision process and of its state at every time step, and whose goal is to maximize its global revenue over time. Solving a Markov decision problem means searching, within a given set of policies, for one that optimizes a performance criterion for the considered MDP. The main criteria studied in the theory of MDPs are the finite-horizon criterion, the discounted criterion, the total-reward criterion, and the average-reward criterion.
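As a concrete illustration of optimizing the discounted criterion, the sketch below runs value iteration on a small hypothetical MDP. The states, actions, transition probabilities, and rewards are invented for the example; this is a minimal sketch of the standard Bellman optimality update, not the chapter's own algorithmic development.

```python
import numpy as np

# Hypothetical 3-state, 2-action MDP (values chosen for illustration only).
n_states, n_actions = 3, 2
gamma = 0.9  # discount factor of the discounted criterion

# P[a, s, t] = probability of moving from state s to state t under action a.
P = np.array([
    [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],  # action 0
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]],  # action 1
])
# R[s, a] = expected immediate reward for taking action a in state s.
R = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.5]])

V = np.zeros(n_states)
for _ in range(10_000):
    # Bellman optimality update:
    # Q(s, a) = R(s, a) + gamma * sum_t P(t | s, a) V(t)
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

# A greedy policy with respect to the converged value function.
policy = Q.argmax(axis=1)
print(V, policy)
```

Because the update is a gamma-contraction in the sup norm, the iteration converges to the unique fixed point of the Bellman optimality operator regardless of the initial V; the greedy policy extracted at the end is then optimal for the discounted criterion.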