Stable Opponent Shaping in Differentiable Games

Letcher, A
Foerster, J
Balduzzi, D
Rocktäschel, T
Whiteson, S

Publication date

May 2019

Publisher

ICLR

Abstract

A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel -- from GANs and intrinsic curiosity to multi-agent RL. Opponent shaping is a powerful approach to improve learning dynamics in these games, accounting for player influence on others' updates. Learning with Opponent-Learning Awareness (LOLA) is a recent algorithm that exploits this response and leads to cooperation in settings like the Iterated Prisoner's Dilemma. Although experimentally successful, we show that LOLA agents can exhibit 'arrogant' behaviour directly at odds with convergence. In fact, remarkably few algorithms have theoretical guarantees applying across all (n-player, non-convex) games....
