A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel – from GANs and intrinsic curiosity to multi-agent RL. Opponent shaping is a powerful approach to improve learning dynamics in these games, accounting for player influence on others’ updates. Learning with Opponent-Learning Awareness (LOLA) is a recent algorithm that exploits this response and leads to cooperation in settings like the Iterated Prisoner’s Dilemma. Although experimentally successful, we show that LOLA agents can exhibit ‘arrogant’ behaviour directly at odds with convergence. In fact, remarkably few algorithms have theoretical guarantees applying across all (n-player, non-convex) games. ...
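The opponent-shaping idea in the abstract above can be sketched on a toy problem. The sketch below is only an illustration under stated assumptions, not the algorithm as specified in the paper. It uses a hypothetical two-player quadratic game with scalar parameters `x` and `y`, a hypothetical look-ahead step size `eta`, and central finite differences instead of automatic differentiation. A naive learner differentiates its own loss with the opponent held fixed. A shaping learner differentiates its loss after the opponent's anticipated gradient step, so the gradient includes the learner's influence on that step.

```python
# Hypothetical two-player differentiable game (chosen for illustration only).
def loss1(x, y):
    return (x + y) ** 2 - 2.0 * x


def loss2(x, y):
    return (x + y) ** 2 - 2.0 * y


def derivative(f, z, h=1e-4):
    """Central finite-difference derivative of a scalar function f at z."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


def naive_grads(x, y):
    """Each player differentiates its own loss, treating the other as fixed."""
    g1 = derivative(lambda u: loss1(u, y), x)
    g2 = derivative(lambda v: loss2(x, v), y)
    return g1, g2


def shaped_grads(x, y, eta):
    """Each player differentiates its loss evaluated after the opponent's
    anticipated naive gradient step of size eta (an assumed parameter)."""

    def lookahead1(u):
        # The opponent's anticipated step depends on player 1's parameter u.
        g2 = derivative(lambda v: loss2(u, v), y)
        return loss1(u, y - eta * g2)

    def lookahead2(v):
        # Symmetric construction for player 2.
        g1 = derivative(lambda u: loss1(u, v), x)
        return loss2(x - eta * g1, v)

    return derivative(lookahead1, x), derivative(lookahead2, y)


if __name__ == "__main__":
    print("naive :", naive_grads(0.0, 0.0))
    print("shaped:", shaped_grads(0.0, 0.0, eta=0.1))
```

With `eta = 0` the shaped gradient reduces to the naive one, which makes the extra term easy to isolate. In practice automatic differentiation would replace the finite differences.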
We describe a generalized Q-learning type algorithm for reinforcement learning in competitive multi-...
In this paper (reinforcement) learning of decision makers that face many different games is studied....
When an opponent with a stationary and stochastic policy is encountered in a two-player competitive ...
Multi-agent settings are quickly gathering importance in machine learning. This includes a plethora ...
Learning in general-sum games is unstable and frequently leads to socially undesirable (Pareto-domin...
Interactions in multiagent systems are generally more complicated than single agent ones. Game theor...
In game-theoretic learning, several agents are simultaneously following their ...
In this paper, we examine the Nash equilibrium convergence properties of no-re...
Our work considers repeated games in which one player has a different objective than others. In part...
This paper introduces a novel payoff-based learning scheme for distributed optimization in repeatedl...
Reinforcement Learning (RL) formalises a problem where an intelligent agent needs to learn and achie...
Machine Learning has recently made significant advances in challenges such as speech and image recog...
Multi-agent learning is a growing area of research. An important topic is to formulate how an agent ...