Some game-theoretic approaches to multiagent reinforcement learning in self-play, i.e., when all agents use the same algorithm for choosing actions, employ equilibria, such as the Nash equilibrium, to compute the agents' policies. These approaches have been applied only to simple examples. In this paper, we present an extended version of Nash Q-Learning that uses the Stackelberg equilibrium to address a wider range of games than Nash Q-Learning alone. We show that mixing the Nash and Stackelberg equilibria can lead to better rewards not only in static games but also in stochastic games. Moreover, we apply the algorithm to a real-world example, the automated vehicle coordination problem.
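To illustrate why a Stackelberg equilibrium can yield a better reward than a Nash equilibrium in a static game, the following minimal sketch compares the two solution concepts on a 2x2 bimatrix game. It is not the algorithm of the paper: the payoff matrices, the function names `stackelberg_pure` and `pure_nash`, the restriction to pure strategies, and the tie-breaking rule (the follower breaks ties in the leader's favour) are all assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def stackelberg_pure(leader: Matrix, follower: Matrix) -> Tuple[int, int]:
    """Pure-strategy Stackelberg outcome: the leader commits to a row,
    the follower best-responds with a column.

    Assumption: follower ties are broken in the leader's favour.
    """
    best_pair = (0, 0)
    best_value = float("-inf")
    for i, follower_row in enumerate(follower):
        top = max(follower_row)
        # All follower best responses to the leader's committed row i.
        responses = [j for j, v in enumerate(follower_row) if v == top]
        j = max(responses, key=lambda c: leader[i][c])
        if leader[i][j] > best_value:
            best_value = leader[i][j]
            best_pair = (i, j)
    return best_pair


def pure_nash(leader: Matrix, follower: Matrix) -> List[Tuple[int, int]]:
    """All pure-strategy Nash equilibria: no player gains by deviating alone."""
    n_rows, n_cols = len(leader), len(leader[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            row_ok = all(leader[i][j] >= leader[k][j] for k in range(n_rows))
            col_ok = all(follower[i][j] >= follower[i][k] for k in range(n_cols))
            if row_ok and col_ok:
                equilibria.append((i, j))
    return equilibria


# Hypothetical payoffs: rows are leader actions, columns are follower actions.
A = [[2, 4], [1, 3]]  # leader
B = [[1, 0], [0, 1]]  # follower

nash = pure_nash(A, B)
stack = stackelberg_pure(A, B)
print("Nash:", nash, "leader payoff:", [A[i][j] for i, j in nash])
print("Stackelberg:", stack, "leader payoff:", A[stack[0]][stack[1]])
```

In this hypothetical game the leader's first row strictly dominates, so the only pure Nash equilibrium is the first row with the first column, giving the leader a payoff of 2. By committing to the dominated second row, the leader induces the follower to play the second column and obtains 3, while the follower's payoff is unchanged at 1.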
This article investigates the performance of independent reinforcement learners in multi-agent games...
This paper investigates a relatively new direction in Multiagent Reinforcement Learning. Most multi...
A number of experimental studies have investigated whether cooperative behavior may emerge in multi-...
We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stoc...
Recently, there have been several attempts to design multiagent Q-learning algorithms that learn equ...
Repeated play in games by simple adaptive agents is investigated. The agents use Q-learning, a speci...
The single-agent multi-armed bandit problem can be solved by an agent that learns the values of each...
This paper introduces Correlated-Q (CE-Q) learning, a multiagent Q-learning algorithm based on the c...
Reinforcement learning can provide a robust and natural means for agents to learn how to coordinate ...
Dynamic noncooperative multiagent systems are systems where self-interested agents interact with eac...
This paper describes an approach to reinforcement learning in multiagent general-sum games in which...
This thesis presents a modified Q-learning algorithm and provides conditions for convergence to a pu...
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' ...
Learning in the real world occurs when an agent, which perceives its current state and takes actions...