We study the problem of information sharing and cooperation in multi-player multi-armed bandits. We propose the first algorithm that achieves logarithmic regret for this problem when the collision reward is unknown. Our results are based on two innovations. First, we show that a simple modification to a successive elimination strategy can be used to allow the players to estimate their suboptimality gaps, up to constant factors, in the absence of collisions. Second, we leverage the first result to design a communication protocol that successfully uses the small reward of collisions to coordinate among players, while preserving meaningful instance-dependent logarithmic regret guarantees.
Comment: 50 page
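For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of textbook single-player successive elimination on Bernoulli arms, with a crude gap estimate recorded when an arm is dropped. It is not the paper's modified strategy and it models no other players or collisions. The function name, the Hoeffding-style confidence radius, and the parameters `delta` and `seed` are illustrative assumptions.

```python
import math
import random


def successive_elimination(means, horizon, delta=0.01, seed=0):
    """Textbook single-player successive elimination (illustrative only).

    Returns the arms still active at the end and, for each eliminated arm,
    the empirical gap to the empirical best arm at elimination time.
    """
    rng = random.Random(seed)
    k = len(means)
    active = list(range(k))
    counts = [0] * k
    sums = [0.0] * k
    gap_estimates = {}
    t = 0
    while len(active) > 1 and t + len(active) <= horizon:
        # Pull every active arm once (round-robin exploration).
        for a in active:
            reward = 1.0 if rng.random() < means[a] else 0.0
            counts[a] += 1
            sums[a] += reward
            t += 1
        n = counts[active[0]]  # all active arms share the same pull count
        # Hoeffding radius with a union bound over arms and rounds (assumed form).
        radius = math.sqrt(math.log(4 * k * n * n / delta) / (2 * n))
        mu_hat = {a: sums[a] / counts[a] for a in active}
        best = max(mu_hat.values())
        survivors = []
        for a in active:
            if best - mu_hat[a] > 2 * radius:
                # Confidently suboptimal: drop it and record the empirical gap.
                gap_estimates[a] = best - mu_hat[a]
            else:
                survivors.append(a)
        active = survivors
    return active, gap_estimates
```

Calling `successive_elimination([0.9, 0.5, 0.2], horizon=20000)` returns the surviving arm indices together with a dictionary of empirical gaps for the eliminated arms. On the event that all confidence intervals hold, each recorded gap is within a constant factor of the true gap, which is the kind of estimate the abstract refers to.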
We study a decentralized cooperative stochastic multi-armed bandit problem with K arms on a network ...
This paper introduces a general multi-agent bandit model in which each agent i...
Works released after June 2022 are not considered in this survey. Due mostly to its application to cog...
We study a multiplayer stochastic multi-armed bandit problem in which players ...
The absence of collision information in Multi-player Multi-armed bandits (MMABs) renders arm availa...
Multi-player multi-armed bandit is an increasingly relevant decision-making problem, motivated by ap...
This paper studies a cooperative multi-agent multi-armed stochastic bandit problem where agents oper...
We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown v...
We propose a novel algorithm for multi-player multi-armed bandits without coll...
The stochastic multi-armed bandit model captures the tradeoff between exploration and exploitation. ...
Recently, there has been extensive study of cooperative multi-agent multi-armed bandits where a set ...
The uncoordinated spectrum access problem is studied using a multi-player multi-armed bandits framew...
Motivated by cognitive radio networks, we consider the stochastic multiplayer ...
Multiplayer bandits have recently been extensively studied because of their application to cognitive...
Motivated by cognitive radios, stochastic multi-player multi-armed bandits gai...