Admissible policy teaching through reward design

Banihashem, K
Singla, A
Gan, J
Radanovic, G

Publication date

January 2022

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Abstract

We study reward design strategies for incentivizing a reinforcement learning agent to adopt a policy from a set of admissible policies. The goal of the reward designer is to modify the underlying reward function cost-efficiently while ensuring that any approximately optimal deterministic policy under the new reward function is admissible and performs well under the original reward function. This problem can be viewed as a dual to the problem of optimal reward poisoning attacks: instead of forcing an agent to adopt a specific policy, the reward designer incentivizes an agent to avoid taking actions that are inadmissible in certain states. Perhaps surprisingly, and in contrast to the problem of optimal reward poisoning attacks, we first show ...
