Model-based reinforcement learning (RL), which finds an optimal policy using an empirical model, has long been recognized as one of the cornerstones of RL. It is especially suitable for multi-agent RL (MARL), as it naturally decouples the learning and planning phases and avoids the non-stationarity problem that arises when all agents improve their policies simultaneously using samples. Though model-based MARL algorithms are intuitive, easy to implement, and widely used, their sample complexity has not been fully investigated. In this paper, we aim to address this fundamental question. We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative mod...
The problem of multiagent learning (or MAL) is concerned with the study of how agents can learn ...
Various types of Multi-Agent Reinforcement Learning (MARL) methods have been developed, assuming tha...
Algorithms designed for single-agent reinforcement learning (RL) generally fail to converge to equil...
We consider the problem of learning the optimal action-value function in disco...
We consider the problem of learning the optimal action-value function in the d...
This paper studies policy optimization algorithms for multi-agent reinforcement learning. We begin b...
Reinforcement Learning (RL) has achieved tremendous empirical successes in real-world decision-makin...
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' ...
In a single-agent setting, reinforcement learning (RL) tasks can be cast into an inference problem ...
With the increasing need for handling large state and action spaces, general function approximation ...
The curse of dimensionality is a widely known issue in reinforcement learning (RL). In the tabular s...
Multi-Agent Reinforcement Learning (MARL) -- where multiple agents learn to interact in a shared dyn...
Markov games is a framework which can be used to formalise n-agent reinforcement learning (RL). Litt...
We consider the problem of learning the optimal action-value function in discounted-reward...
We consider model-based multi-agent reinforcement learning, where the environment transition model i...