Entropic Risk Measure in Policy Search

Nass, David
Belousov, Boris
Peters, Jan

Publication date

January 2022

DOI

10.26083/tuprints-00020551

Publisher

IEEE

Abstract

With the increasing pace of automation, modern robotic systems need to act in stochastic, non-stationary, partially observable environments. A range of algorithms for finding parameterized policies that optimize for long-term average performance have been proposed in the past. However, the majority of the proposed approaches do not explicitly take into account the variability of the performance metric, which may lead to policies that, although performing well on average, can perform spectacularly badly in a particular run or over a period of time. To address this shortcoming, we study an approach to policy optimization that explicitly takes into account higher-order statistics of the reward function. In this paper, we extend policy g...
