In this paper, we address the problem of convergence to Nash equilibria in games with rewards that are initially unknown and must be estimated over time from noisy observations. These games arise in many real-world applications, whenever rewards for actions cannot be prespecified and must be learned online, but standard results in game theory do not consider such settings. For this problem, we derive a multiagent version of Q-learning to estimate the reward functions using novel forms of the epsilon-greedy learning policy. Using these Q-learning schemes to estimate reward functions, we then provide conditions guaranteeing the convergence of adaptive play and the better-reply processes to Nash equilibria in potential games and games with mor...
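To make the estimation step concrete, here is a minimal sketch of epsilon-greedy reward estimation from noisy observations in a single-state, two-player game. It is not the paper's algorithm: the function name `estimate_rewards`, the decaying schedule `eps = t ** -0.5`, the uniformly random opponent, and the Gaussian noise model are all assumptions made for illustration.

```python
import random

def estimate_rewards(true_reward, n_actions, steps=20000, noise=0.1, seed=0):
    """Estimate one agent's reward for every joint action from noisy samples.

    All modeling choices here are illustrative assumptions, not the paper's.
    """
    rng = random.Random(seed)
    joint = [(a, b) for a in range(n_actions) for b in range(n_actions)]
    q = {ab: 0.0 for ab in joint}    # running reward estimates (stateless Q-values)
    count = {ab: 0 for ab in joint}  # number of observations per joint action

    for t in range(1, steps + 1):
        eps = t ** -0.5  # hypothetical decaying exploration rate
        if rng.random() < eps:
            # Explore: pick an action uniformly at random.
            a = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the best estimate summed over
            # the opponent's actions (opponent assumed uniform).
            a = max(range(n_actions),
                    key=lambda x: sum(q[(x, b)] for b in range(n_actions)))
        b = rng.randrange(n_actions)  # opponent modeled as uniformly random (assumption)

        # Noisy observation of the true reward for the joint action played.
        r = true_reward[(a, b)] + rng.gauss(0.0, noise)

        # Sample-average update: Q <- Q + (r - Q) / n.
        count[(a, b)] += 1
        q[(a, b)] += (r - q[(a, b)]) / count[(a, b)]

    return q, count
```

With a step size of 1/n the update reduces to a running sample mean, so each estimate approaches the true reward as its joint action is observed more often; the exploration schedule only controls how quickly the rarely chosen joint actions accumulate observations.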
Dynamic noncooperative multiagent systems are systems where self-interested agents interact with eac...
The paper develops a framework for the analysis of finite n-player games, recurrently played by rand...
We propose a multi-agent reinforcement learning dynamics, and analyze its convergence properties in ...
Fudenberg and Kreps (1993) consider adaptive learning processes, in the spirit of fictitious play, for...
This paper examines the equilibrium convergence properties of no-regret learni...
In this paper, we provide a theoretical prediction of the way in which adaptive players behave in th...
Algorithmically designed reward functions can influence groups of learning agents toward measurable ...
Fudenberg and Kreps consider adaptive learning processes, in the spirit of fictitious play, for inf...
In this paper, we address multi-agent decision problems where all agents share a common goal. This c...
This article investigates the performance of independent reinforcement learners in multi-agent games...
We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stoc...
We investigate multi-agent reinforcement learning for stochastic games with complex tasks, where the...
We develop a unified stochastic approximation framework for analyzing th...
This paper investigates the problem of policy learning in multiagent environments using the stochast...