In this paper, we address multi-agent decision problems in which all agents share a common goal. Such problems are suitably modeled as finite-state Markov games with identical interests. We tackle the problem of coordination and contribute a new algorithm, coordinated Q-learning (CQL). CQL combines Q-learning with biased adaptive play, a coordination mechanism based on the principle of fictitious play. We analyze how the two methods can be combined without compromising the convergence of either. We illustrate the performance of CQL in several different environments and discuss several properties of the algorithm.

Recent years have witnessed increasing interest in extending reinforcement learning (RL) to multi-agent problems. Howeve...
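To make the coordination problem in the abstract concrete, here is a minimal sketch in Python. It is not the CQL algorithm and does not implement biased adaptive play. It pairs a stateless Q-learning update with a plain fictitious-play-style best response, in which each agent best-responds to the empirical frequency of the other agent's past actions. The 2x2 payoff matrix, the function names (`best_response`, `run`), and all parameter values are assumptions chosen for illustration.

```python
import random

# Hypothetical 2x2 identical-interest game: both agents receive the same reward.
# Two optimal joint actions, (0, 0) and (1, 1), create the coordination problem.
REWARD = [[10.0, 0.0],
          [0.0, 10.0]]
N_ACTIONS = 2


def best_response(q, other_counts):
    """Return the action with the highest expected Q-value against the
    empirical distribution of the other agent's past actions."""
    total = sum(other_counts)
    expected = [
        sum(q[a][b] * other_counts[b] / total for b in range(N_ACTIONS))
        for a in range(N_ACTIONS)
    ]
    # Ties are broken deterministically in favor of the lowest index.
    return expected.index(max(expected))


def run(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # q[i][own][other]: agent i's estimate of the common reward for a joint action.
    q = [[[0.0] * N_ACTIONS for _ in range(N_ACTIONS)] for _ in range(2)]
    # counts[i][b]: how often agent i has seen the other agent play b
    # (initialized to 1 so the empirical distribution is always defined).
    counts = [[1] * N_ACTIONS for _ in range(2)]

    for _ in range(episodes):
        actions = []
        for i in range(2):
            if rng.random() < epsilon:
                actions.append(rng.randrange(N_ACTIONS))  # explore
            else:
                actions.append(best_response(q[i], counts[i]))
        reward = REWARD[actions[0]][actions[1]]

        for i in range(2):
            own, other = actions[i], actions[1 - i]
            # Single-state game, so the Q-learning target is just the reward.
            q[i][own][other] += alpha * (reward - q[i][own][other])
            counts[i][other] += 1

    greedy = tuple(best_response(q[i], counts[i]) for i in range(2))
    return q, greedy


if __name__ == "__main__":
    q_tables, joint_action = run()
    print("greedy joint action:", joint_action)
```

In this sketch, the Q-values alone cannot break the tie between the two optimal joint actions. It is the action counts that steer both agents toward the same one.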
We report on an investigation of reinforcement learning techniques for the learning of coordination ...
This paper investigates the problem of policy learning in multiagent environments using the stochast...
In this paper we address the problem of coordination in multi-agent sequential decision problems wit...
This article investigates the performance of independent reinforcement learners in multi-agent games...
Dynamic noncooperative multiagent systems are systems where self-interested agents interact with eac...
In this paper, we address the problem of convergence to Nash equilibria in games with rewards that a...
We describe a generalized Q-learning type algorithm for reinforcement learning in competitive multi-...
Being able to accomplish tasks with multiple learners through learning has long been a goal of the m...
Learning behaviors in a multiagent environment is crucial for developing and adapting multiagent sys...
Reinforcement learning can provide a robust and natural means for agents to learn how to coordinate ...
In this thesis, we first suggest a new type of Markov model extended by Watkins’ action replay proce...
We propose a multi-agent reinforcement learning dynamics, and analyze its convergence properties in ...
On-line learning methods have been applied successfully in multi-agent systems to achieve coordinati...