Computing Nash equilibria is a key challenge in normal-form games with large strategy spaces, for which the open-ended learning framework provides an efficient approach. Previous studies invariably employ diversity as a conduit to foster strategy improvement. However, diversity-based algorithms only work in zero-sum games with cyclic dimensions, which limits their applicability. Here, we propose SC-PSRO (Self-Confirming Policy Space Response Oracle), an innovative unified open-ended learning framework for both zero-sum and general-sum games. In particular, we introduce the advantage function as an improved evaluation metric for strategies, allowing for a unified learning objective for...
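The abstract describes a PSRO-style loop in which strategies are evaluated by an advantage function rather than a diversity metric. As a hedged illustration only (this is not the paper's SC-PSRO algorithm), the sketch below runs a minimal PSRO loop on rock-paper-scissors: a fictitious-play meta-solver approximates the meta-Nash mixture over the current population, and the oracle step adds the pure strategy with the largest advantage (payoff against the mixture minus the mixture's own value). All function names and the choice of meta-solver are illustrative assumptions.

```python
import numpy as np

# Row player's payoff for rock-paper-scissors (zero-sum, symmetric toy game).
PAYOFF = np.array([[0., -1., 1.],
                   [1., 0., -1.],
                   [-1., 1., 0.]])

def meta_nash(sub_payoff, iters=2000):
    """Approximate the meta-Nash mixture of a restricted symmetric
    zero-sum meta-game via fictitious play (empirical frequencies)."""
    counts = np.ones(sub_payoff.shape[0])
    for _ in range(iters):
        mix = counts / counts.sum()
        # Best response to the opponent's empirical mixture.
        counts[np.argmax(sub_payoff @ mix)] += 1
    return counts / counts.sum()

def advantage(candidate, population, mix):
    """Advantage of a candidate pure strategy: its payoff against the
    meta-Nash mixture minus the value that mixture already secures."""
    value = mix @ PAYOFF[np.ix_(population, population)] @ mix
    return PAYOFF[candidate, population] @ mix - value

def psro(max_iters=10):
    population = [0]  # start from a single pure strategy (rock)
    mix = np.array([1.0])
    for _ in range(max_iters):
        sub = PAYOFF[np.ix_(population, population)]
        mix = meta_nash(sub)
        # Oracle step: select the pure strategy with the largest advantage.
        advs = [advantage(a, population, mix) for a in range(len(PAYOFF))]
        best = int(np.argmax(advs))
        if advs[best] <= 1e-6:   # no strategy improves on the mixture: done
            break
        if best not in population:
            population.append(best)
    return population, mix
```

On this toy game the loop grows the population from rock alone to all three pure strategies, at which point no candidate has positive advantage and the meta-Nash mixture is approximately uniform.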
We introduce efficient learning equilibrium (ELE), a normative approach to learning in non-c...
Model-free learning for multi-agent stochastic games is an active area of research. Existing reinfor...
An individual’s learning rule is completely uncoupled if it does not depend directly on the actions ...
Policy Space Response Oracle methods (PSRO) provide a general solution to learn Nash equilibrium in ...
This paper addresses the problem of learning a Nash equilibrium in γ-discounte...
In this thesis, we explore the use of policy approximation for reducing the computational cost of le...
Multi-agent reinforcement learning (MARL) has become effective in tackling discrete cooperative game...
When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often c...
We consider the problem of learning strategy selection in games. The theoretical so...
Promoting behavioural diversity is critical for solving games with non-transitive dynamics where str...
Fictitious play is a game theoretic iterative procedure meant to learn an equi...
Recent advances in multiagent learning have seen the introduction of a family of algorithms that revo...
Solving strategic games with huge action spaces is a critical yet under-explored topic in economics...