We investigate multi-agent reinforcement learning for stochastic games with complex tasks, where the reward functions are non-Markovian. We use reward machines to encode high-level knowledge of these tasks. We develop an algorithm, Q-learning with reward machines for stochastic games (QRM-SG), that learns each agent's best-response strategy at a Nash equilibrium. In QRM-SG, the Q-function at a Nash equilibrium is defined on an augmented state space that combines the state of the stochastic game with the states of the reward machines. Each agent learns the Q-functions of all agents in the system. We prove that Q-functions learned in QRM-SG converge to the Q-functions at a Nash equilibrium if the stage game at ...
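To make the augmented state space concrete, here is a minimal sketch of a reward machine and of the paired (game state, reward-machine states) key that a Q-table could be indexed by. It is an illustration under stated assumptions, not the QRM-SG implementation: the `RewardMachine` class, the labels "a" and "b", the `run` helper, and `augmented_state` are all hypothetical names, and the Nash-equilibrium update itself is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite-state machine that emits rewards on high-level events (labels)."""
    initial: int
    # (rm_state, label) -> (next_rm_state, reward).
    # Assumption: pairs that are not listed self-loop with reward 0.
    delta: dict = field(default_factory=dict)

    def step(self, u, label):
        return self.delta.get((u, label), (u, 0.0))


# Hypothetical task: observe "a" first, then "b".
# The reward for "b" depends on history, so it is non-Markovian in the
# game state alone, but Markovian once the RM state is tracked.
rm = RewardMachine(initial=0, delta={(0, "a"): (1, 0.0), (1, "b"): (2, 1.0)})


def run(labels):
    """Feed a label sequence through the machine; return RM states and rewards."""
    u, trace, rewards = rm.initial, [], []
    for label in labels:
        u, r = rm.step(u, label)
        trace.append(u)
        rewards.append(r)
    return trace, rewards


def augmented_state(game_state, rm_states):
    """Pair the game state with every agent's RM state (a hashable Q-table key)."""
    return (game_state, tuple(rm_states))
```

The same label "b" yields reward 0 in reward-machine state 0 and reward 1 in state 1, which is why the Q-function is defined over the paired state and not over the game state alone.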
This paper introduces a new multi-agent learning algorithm for stochastic games based on replicator ...
Solving multi-agent reinforcement learning problems has proven difficult because of the lack of trac...
We propose a multi-agent reinforcement learning dynamics, and analyze its convergence properties in ...
We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stoc...
In this paper, we address the problem of convergence to Nash equilibria in games with rewards that a...
We describe a process of reinforcement learning in two-agent general-sum stoch...
Model-free learning for multi-agent stochastic games is an active area of research. Existing reinfor...
Dynamic noncooperative multiagent systems are systems where self-interested agents interact with eac...
Algorithmically designed reward functions can influence groups of learning agents toward measurable ...
A large class of sequential decision making problems under uncertainty with multiple competing decis...
In this paper, we address multi-agent decision problems where all agents share a common goal. This c...
Some game theory approaches to solve multiagent reinforcement learning in self play, i.e. when agen...
Recently, there have been several attempts to design multiagent Q-learning algorithms that learn equ...
In this thesis, we explore the use of policy approximation for reducing the computational cost of le...
Multi-agent reinforcement learning (MARL) has become effective in tackling discrete cooperative game...