We consider a rate-constrained contextual multi-armed bandit (RC-CMAB) problem, in which a group of agents are solving the same contextual multi-armed bandit (CMAB) problem. However, the contexts are observed by a remotely connected entity, i.e., the decision-maker, that updates the policy to maximize the returned rewards, and communicates the arms to be sampled by the agents to a controller over a rate-limited communications channel. This framework can be applied to personalized ad placement, whenever the content owner observes the website visitors, and hence has the context, but needs to transmit the ads to be shown to a controller that is in charge of placing the marketing content. Consequently, the rate-constrained CMAB (RC-CMAB) problem ...
© 2017 Neural information processing systems foundation. All rights reserved. We consider the proble...
Contextual bandits with linear payoffs, which are also known as linear bandits, provide a powerful a...
Motivated by cognitive radio networks, we consider the stochastic multiplayer ...
We consider a remote contextual multi-armed bandit (CMAB) problem, in which the decision-maker obser...
The bandit problem models a sequential decision process between a player and an environment. In the ...
In a bandit problem there is a set of arms, each of which when played by an agent yields some reward...
The data explosion and the development of artificial intelligence (AI) have fueled the demand for recomme...
We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side...
A collaborative task is assigned to a multiagent system (MAS) in which agents are allow...
Multi-armed bandits (MAB) have attracted much attention as a means of capturing the exploration and ...
A standard assumption in contextual multi-armed bandits is that the true context is perfectly known bef...
Machine and Statistical Learning techniques are used in almost all online advertisement systems. The...
We study throughput utility maximization in a multi-user network with partially observable Markovian...
We consider the scenario of a cognitive radio network overlaying on top of a legacy primary network ...
We consider the problem of multiple users targeting the arms of a single multi-armed stochastic band...