We improve the efficiency of algorithms for stochastic combinatorial semi-bandits. In most interesting problems, state-of-the-art algorithms take advantage of structural properties of rewards, such as independence. However, while optimal in terms of asymptotic regret, these algorithms are computationally inefficient. In our paper, we first reduce their implementation to a specific submodular maximization problem. Then, in the case of matroid constraints, we design adapted approximation routines, thereby providing the first efficient algorithms that rely on reward structure to improve the regret bound. In particular, we improve the state-of-the-art efficient gap-free regret bound by a factor √m / log m, where m is the maximum action size. Fin...
We propose a Bayesian information-geometric approach to the exploration-exploi...
We propose a new online algorithm for minimizing the cumulative regret in stochastic linear bandits....
We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, w...
We improve the efficiency of algorithms for stochastic combinatorial semi-band...
This paper investigates stochastic and adversarial combinatorial multi-armed b...
A stochastic combinatorial semi-bandit with a linear payoff is a sequential learning problem where a...
Combinatorial stochastic semi-bandits appear naturally in many contexts where the exploration/exploi...
A stochastic combinatorial semi-bandit is an on-line learning problem where at each step a learning...
A stochastic combinatorial semi-bandit is an on-line learning problem where at each step a learning...
Combinatorial stochastic semi-bandits appear naturally in many contexts where the exploration/exploi...
The contextual combinatorial semi-bandit problem with linear payoff functions is a decision-making p...
We consider the problem of online combinatorial optimization under semi-bandit...
Multi-Armed Bandits (MAB) constitute the most fundamental model for sequential decision making probl...
We consider combinatorial semi-bandits over a set of arms X ⊂ {0, 1}^d where r...
We consider combinatorial semi-bandits over a set X ⊂ {0, 1}^d where rewards a...