In this paper, we consider efficient learning in large-scale combinatorial semi-bandits with linear generalization, and as a solution, propose a novel learning algorithm called Randomized Combinatorial Maximization (RCM). RCM is motivated by Thompson sampling, and is computationally efficient as long as the offline version of the combinatorial problem can be solved efficiently. We establish that RCM is provably statistically efficient in the coherent Gaussian case, by developing a Bayes regret bound that is independent of the problem scale (number of items) and sublinear in time. We also evaluate RCM on a variety of real-world problems with thousands of items. Our experimental results demonstrate that RCM learns two orders of magnitude faster...
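To make the interaction loop described above concrete, here is a minimal, illustrative sketch of a Thompson-sampling-style learner for a linear combinatorial semi-bandit. This is not the paper's RCM implementation: it assumes a Gaussian prior and observation noise, takes a top-K cardinality constraint as the offline combinatorial problem, and uses hypothetical names (`topk_oracle`, `ts_comb_semibandit`) introduced only for this example.

```python
import numpy as np

def topk_oracle(scores, k):
    """Offline maximization oracle for an assumed cardinality constraint:
    return the indices of the k highest-scoring items."""
    return np.argsort(scores)[-k:]

def ts_comb_semibandit(features, theta_star, k, horizon, noise_sd=0.1, prior_sd=1.0, seed=0):
    """Thompson-sampling-style learner for a linear combinatorial semi-bandit (sketch).
    features: (L, d) item feature matrix; theta_star: true parameter (used only to simulate feedback)."""
    rng = np.random.default_rng(seed)
    L, d = features.shape
    # Gaussian prior N(0, prior_sd^2 I), tracked through its precision matrix and the vector b = X^T y / sigma^2.
    precision = np.eye(d) / prior_sd**2
    b = np.zeros(d)
    total_reward = 0.0
    for t in range(horizon):
        cov = np.linalg.inv(precision)
        mean = cov @ b
        theta_tilde = rng.multivariate_normal(mean, cov)  # posterior sample
        scores = features @ theta_tilde                   # sampled item scores
        chosen = topk_oracle(scores, k)                   # offline combinatorial step
        # Semi-bandit feedback: observe a noisy reward for every chosen item.
        rewards = features[chosen] @ theta_star + noise_sd * rng.standard_normal(len(chosen))
        total_reward += rewards.sum()
        # Bayesian linear-regression posterior update from the observed (feature, reward) pairs.
        X = features[chosen]
        precision += X.T @ X / noise_sd**2
        b += X.T @ rewards / noise_sd**2
    return total_reward

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    feats = rng.standard_normal((50, 5))
    theta = rng.standard_normal(5)
    print(ts_comb_semibandit(feats, theta, k=3, horizon=200))
```

Any exact offline solver (shortest path, matching, matroid optimization, and so on) could stand in for `topk_oracle`, since the learner only needs a maximizer of the sampled linear objective over the feasible sets.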
This article is a brief exposition of some of the important links between machine learning and combinatorial ...
The Multi-Armed Bandit (MAB) framework has been successfully applied in many web applications. However, ...
In this paper, we first study the problem of combinatorial pure exploration with full-bandit feedback...
A stochastic combinatorial semi-bandit is an online learning problem where at each step a learning...
We consider combinatorial semi-bandits over a set X ⊂ {0, 1}^d where rewards are ...
A stochastic combinatorial semi-bandit with a linear payoff is a sequential learning problem where a...
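For readers unfamiliar with the linear-payoff model these abstracts refer to, the following display gives one common formalization in illustrative notation (the individual papers may differ in details): each item e has a known feature vector x_e ∈ R^d, its mean reward is linear in an unknown parameter, the payoff of a feasible set is the sum of its items' rewards, and performance is measured by regret against the best feasible set.

```latex
% Linear combinatorial semi-bandit (illustrative notation, not taken from any single cited paper):
% feasible sets A \in \mathcal{A} \subseteq 2^{\{1,\dots,L\}}, unknown parameter \theta^* \in \mathbb{R}^d.
\bar{w}(e) = x_e^{\top}\theta^*, \qquad
f(A,\bar{w}) = \sum_{e \in A} \bar{w}(e), \qquad
A^* = \arg\max_{A \in \mathcal{A}} f(A,\bar{w}),
\qquad
R(T) = \mathbb{E}\!\left[\sum_{t=1}^{T}\bigl(f(A^*,\bar{w}) - f(A_t,\bar{w})\bigr)\right].
```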
We study the problem of combinatorial pure exploration in the stochastic multi-armed bandit problem....
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the...
We improve the efficiency of algorithms for stochastic combinatorial semi-bandits...
We consider combinatorial semi-bandits over a set of arms X ⊂ {0, 1}^d where rewards ...
We investigate stochastic combinatorial multi-armed bandit with semi-bandit feedback...