International audience. In this paper, we examine the Nash equilibrium convergence properties of no-regret learning in general N-player games. For concreteness, we focus on the archetypal "follow the regularized leader" (FTRL) family of algorithms, and we consider the full spectrum of uncertainty that the players may encounter, from noisy, oracle-based feedback to bandit, payoff-based information. In this general context, we establish a comprehensive equivalence between the stability of a Nash equilibrium and its support: a Nash equilibrium is stable and attracting with arbitrarily high probability if and only if it is strict (i.e., each equilibrium strategy has a unique best response). This equivalence extends existing continuous-time versio...
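To make the FTRL setting above concrete, the following is a minimal sketch of one member of the family, assuming an entropic regularizer (which yields the exponential-weights update) and noisy oracle feedback. The 2x2 coordination game, the Gaussian noise model, the 1/sqrt(t) step size, the initial scores, and the names `softmax` and `run_ftrl` are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def softmax(y):
    """Choice map of entropic FTRL: maps cumulative scores to a mixed strategy."""
    z = np.exp(y - y.max())  # subtract the max for numerical stability
    return z / z.sum()


def run_ftrl(A, B, steps=2000, noise=0.2, seed=0, init=(1.0, 0.0)):
    """Two-player entropic FTRL with noisy payoff-vector feedback (a sketch).

    A[i, j] and B[i, j] are the payoffs of players 1 and 2 when player 1
    plays action i and player 2 plays action j. All defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    y1 = np.array(init, dtype=float)  # score vector of player 1
    y2 = np.array(init, dtype=float)  # score vector of player 2
    for t in range(1, steps + 1):
        x1, x2 = softmax(y1), softmax(y2)
        # Noisy oracle feedback: expected payoff of each action plus noise.
        v1 = A @ x2 + noise * rng.standard_normal(A.shape[0])
        v2 = B.T @ x1 + noise * rng.standard_normal(B.shape[1])
        step = 1.0 / np.sqrt(t)  # vanishing step size (assumed schedule)
        y1 += step * v1
        y2 += step * v2
    return softmax(y1), softmax(y2)


# Coordination game: (action 0, action 0) and (action 1, action 1) are both
# strict Nash equilibria; the initial scores start play near the first one.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = A.copy()
x1, x2 = run_ftrl(A, B)
print(x1, x2)
```

In this sketch, play that starts near the strict equilibrium (action 0, action 0) stays close to it and concentrates on it despite the noise, which is the kind of local behavior the stated equivalence describes. It illustrates a single run under the assumptions listed, not the paper's general result.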
Motivated by the scarcity of accurate payoff feedback in practical application...
34 pages, 6 figures. We investigate a class of reinforcement learning dynamics i...
We show that the logit-response dynamics converges to a subset of (strict) Nash equilibria for any w...
In this paper, we examine the convergence rate of a wide range of regularized ...
In game-theoretic learning, several agents are simultaneously following their ...
This paper examines the equilibrium convergence properties of no-regret learni...
Understanding the behavior of no-regret dynamics in general N-player games is ...