We present a reinforcement learning approach for exploring and optimizing a safety-constrained Markov Decision Process (MDP). In this setting, the agent must maximize discounted cumulative reward while keeping the probability of entering unsafe states, defined via a safety function, within a given tolerance. The safety values of the states are not known a priori, and we model them probabilistically via a Gaussian Process (GP) prior. Behaving properly in such an environment therefore requires balancing a three-way trade-off between exploring the safety function, exploring the reward function, and exploiting acquired knowledge to maximize reward. We propose a novel approach to balance this trade-off. Specifically, our approach explores unvisited ...
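One way to formalize the setting just described (the notation below is ours, not taken from the abstract): the agent maximizes discounted return subject to a chance constraint on unsafe states, with the unknown safety function g modeled by a GP prior,
\[
\max_{\pi}\; \mathbb{E}_{\pi}\!\Big[\sum_{t=0}^{\infty}\gamma^{t}\, r(s_t,a_t)\Big]
\quad\text{s.t.}\quad
\Pr_{\pi}\!\big(g(s_t) < h \ \text{for some } t\big) \le \delta,
\qquad g \sim \mathcal{GP}(\mu_0, k),
\]
where h is the safety threshold, \delta the tolerated probability of entering unsafe states, \gamma the discount factor, and \mu_0, k the GP prior mean and covariance.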
www.cs.tu-berlin.de/~geibel Abstract. In this article, I will consider Markov Decision Processes wit...
Markov decision processes (MDPs) are a standard modeling tool for sequential decision making in a dyna...
We address the issue of safety in reinforcement learning. We pose the problem in an episodic framewo...
Markov decision processes (MDPs) are the de facto framework for sequential decision making in the pre...
In environments with uncertain dynamics, exploration is necessary to learn how to perform well. Exi...
This paper concerns the efficient construction of a safety shield for reinforcement learning. We spe...
In reinforcement learning (RL), an agent must explore an initially unknown environment in order to l...
In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are thos...
When exploring an unknown environment, a mobile robot must decide where to observe next. It must do ...
Many physical systems have underlying safety considerations that require that the policy employed en...
We consider sequential decision problems under uncertainty, where we seek to optimize an unknown fun...
Reinforcement learning for robotic applications faces the challenge of constraint satisfa...
Often the most practical way to define a Markov Decision Process (MDP) is as a simulator that, given...
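As a concrete illustration of that simulator-based view (a minimal sketch; the class and toy dynamics below are ours, not taken from the cited work), an MDP can be exposed purely as a sampler that maps a state and an action to a next state and a reward:

import random

class SimulatorMDP:
    # A generative-model MDP: no explicit transition matrix, only a sampler
    # that, given a state and an action, returns a next state and a reward.
    def __init__(self, n_states=5, seed=0):
        self.n_states = n_states
        self.rng = random.Random(seed)

    def step(self, state, action):
        # Toy dynamics: the action nudges the state, with random noise added.
        noise = self.rng.choice([-1, 0, 1])
        next_state = max(0, min(self.n_states - 1, state + action + noise))
        reward = 1.0 if next_state == self.n_states - 1 else 0.0  # goal reward
        return next_state, reward

# Usage: planning or learning algorithms interact only through step().
sim = SimulatorMDP()
state = 0
for _ in range(10):
    state, reward = sim.step(state, action=1)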
Acting in domains where an agent must plan several steps ahead to achieve a goal can be a ch...
Replicating the human ability to solve complex planning problems based on minimal prior knowledge ha...