Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Lecarpentier, Erwan
Rachelson, Emmanuel

Publication date

January 2019

Abstract

This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) a current model is known at each decision epoch but not its evolution. Our contribution can be presented in four points. 1) We define a specific class of MDPs that we call Non-Stationary MDPs (NSMDPs). We introduce the notion of regular evolution by making a hypothesis of Lipschitz continuity on the transition and reward functions w.r.t. time; 2) we consider a planning agent using the current model of the env...
