We consider reinforcement learning algorithms in normal form games. Using two-timescales stochastic approximation, we introduce a model-free algorithm which is asymptotically equivalent to the smooth fictitious play algorithm, in that both result in asymptotic pseudotrajectories to the flow defined by the smooth best response dynamics. Both of these algorithms are shown to converge almost surely to a Nash distribution in two-player zero-sum games and N-player partnership games. However, there are simple games for which these, and most other adaptive processes, fail to converge; in particular, we consider the N-player matching pennies game and Shapley's variant of the rock-scissors-paper game. By extending stochastic approximation results ...
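The coupling this abstract describes, fast value estimates driving a slowly adapting strategy through a smooth (logit) best response, can be illustrated with a minimal sketch. This is not the paper's exact algorithm: the matching pennies payoffs, softmax temperature, and step-size exponents are illustrative assumptions, and the value update below is a simplified model-free rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Matching pennies payoffs for the row player (the column player
# receives the negation); an illustrative choice, not taken from
# the abstract.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def smooth_best_response(q, tau=0.1):
    """Logit (softmax) smooth best response to action values q."""
    z = (q - q.max()) / tau  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Per-player action-value estimates (fast timescale) and mixed
# strategies (slow timescale).
Q = [np.zeros(2), np.zeros(2)]
pi = [np.full(2, 0.5), np.full(2, 0.5)]

for n in range(1, 100_001):
    a = rng.choice(2, p=pi[0])
    b = rng.choice(2, p=pi[1])
    rewards = (A[a, b], -A[a, b])

    fast = n ** -0.6  # value estimates move on the faster timescale
    slow = n ** -1.0  # strategies move on the slower timescale
    for i, act in enumerate((a, b)):
        # Model-free: each player sees only its own realised reward.
        Q[i][act] += fast * (rewards[i] - Q[i][act])
        # The strategy tracks the smooth best response to current values.
        pi[i] += slow * (smooth_best_response(Q[i]) - pi[i])

print(pi[0], pi[1])  # both drift toward the (0.5, 0.5) Nash distribution
```

Because the value step decays more slowly than the strategy step (n^-0.6 versus n^-1), the Q-estimates track the payoffs of the current strategies while play drifts along the smooth best response dynamics, which is the two-timescale structure the abstract refers to.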
Solving multi-agent reinforcement learning problems has proven difficult because of the lack of trac...
We present a new algorithm for polynomial time learning of optimal behavior in stochastic games. Thi...
Recent extensions to dynamic games (Leslie et al. [2020], Sayin et al. [2020], Baudin and Laraki [20...
Hirsch [2], is called smooth fictitious play. Using techniques from stochastic approximation by the ...
This thesis makes two extensions to the standard stochastic approximation framework in order to stud...
This paper proposes an extension of a popular decentralized discrete-time learning procedure when re...
Consider a 2-player normal-form game repeated over time. We introduce an adaptive learning p...
Fudenberg and Kreps (1993) consider adaptive learning processes, in the spirit of fictitious play, for inf...
We develop a unified stochastic approximation framework for analyzing th...
Recent extensions to dynamic games of the well-known fictitious play learning procedure in static ga...
In this paper, we address the problem of convergence to Nash equilibria in games with rewards that a...
The single-agent multi-armed bandit problem can be solved by an agent that learns the values of each...
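As a point of contrast with the multi-agent setting, here is a minimal sketch of the single-agent value-learning approach this abstract alludes to, using epsilon-greedy exploration; the three-armed bandit, its arm means, the noise model, the exploration rate, and the horizon are all illustrative assumptions, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(1)

true_means = np.array([0.2, 0.5, 0.8])  # hypothetical 3-armed bandit
Q = np.zeros(3)       # estimated value of each arm
pulls = np.zeros(3)   # number of times each arm was played
eps = 0.1             # exploration rate (illustrative)

for t in range(10_000):
    # Epsilon-greedy: usually exploit the best estimate, occasionally explore.
    if rng.random() < eps:
        a = int(rng.integers(3))
    else:
        a = int(np.argmax(Q))
    r = rng.normal(true_means[a], 1.0)  # noisy reward from the chosen arm
    pulls[a] += 1
    Q[a] += (r - Q[a]) / pulls[a]  # incremental sample-average update

print(Q.round(2))  # estimates converge to the true arm means
```

The sample-average update is the stationary single-agent case; the game-theoretic difficulty discussed in the surrounding abstracts arises precisely because other adapting players make the reward process non-stationary.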
Recent models of learning in games have attempted to produce individual-level learning algorithms th...