An ideal strategy in zero-sum games should not only grant the player an average reward no less than the value of Nash equilibrium, but also exploit the (adaptive) opponents when they are suboptimal. While most existing works in Markov games focus exclusively on the former objective, it remains open whether we can achieve both objectives simultaneously. To address this problem, this work studies no-regret learning in Markov games with adversarial opponents when competing against the best fixed policy in hindsight. Along this direction, we present a new complete set of positive and negative results: When the policies of the opponents are revealed at the end of each episode, we propose new efficient algorithms achieving $\sqrt{K}$-regret bounds ...
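For reference, the regret notion invoked above (competing against the best fixed policy in hindsight) can be written down explicitly. The following is a minimal sketch under assumed standard episodic Markov-game notation that does not appear in the excerpt: $V^{\mu,\nu}$ denotes the learner's expected return when the learner plays policy $\mu$ and the opponent plays $\nu$, and $(\mu^k,\nu^k)$ are the policies actually played in episode $k$ of $K$:
\[
\mathrm{Regret}(K) \;=\; \max_{\mu}\,\sum_{k=1}^{K} V^{\mu,\nu^k} \;-\; \sum_{k=1}^{K} V^{\mu^k,\nu^k}.
\]
Under this convention, a $\sqrt{K}$-regret bound means $\mathrm{Regret}(K) = \widetilde{O}(\sqrt{K})$, i.e., the average per-episode shortfall relative to the best fixed policy in hindsight vanishes as $K$ grows.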
We present a new method for learning good strategies in zero-sum Markov games in which each side is...
This paper explores a fundamental connection between computational learning theory and game theory t...
We study what dataset assumption permits solving offline two-player zero-sum Markov games. In stark ...
In this paper, we study the learning problem in two-player general-sum Markov Games. We consider the...
In game-theoretic learning, several agents are simultaneously following their ...
We present new results on the efficiency of no-regret algorithms in the context of multiagent learn...
We study decentralized policy learning in Markov games where we control a single agent to play with ...
The main contribution of this paper consists in extending several non-st...
We examine the problem of regret minimization when the learner is involved in ...
Our work considers repeated games in which one player has a different objective than others. In part...
Many situations involve repeatedly making decisions in an uncertain environment: for instance, decid...
We propose a multi-agent reinforcement learning dynamics, and analyze its convergence properties in ...
Computing Nash equilibrium policies is a central problem in multi-agent reinforcement learning that ...
This paper examines the equilibrium convergence properties of no-regret learni...