We improve the efficiency of algorithms for stochastic combinatorial semi-bandits. In most interesting problems, state-of-the-art algorithms take advantage of structural properties of rewards, such as independence. However, while optimal in terms of asymptotic regret, these algorithms are computationally inefficient. In our paper, we first reduce their implementation to a specific submodular maximization problem. Then, in the case of matroid constraints, we design adapted approximation routines, thereby providing the first efficient algorithms that rely on reward structure to improve the regret bound. In particular, we improve the state-of-the-art efficient gap-free regret bound by a factor √m / log m, where m is the maximum action size. Fin...
Many important optimization problems, such as the minimum spanning tree and minimum-cost flow, can b...
This paper introduces and addresses a wide class of stochastic bandit problems...
Multi-Armed Bandits (MAB) constitute the most fundamental model for sequential decision making probl...
We improve the efficiency of algorithms for stochastic combinatorial semi-band...
A stochastic combinatorial semi-bandit is an on-line learning problem where at each step a learning...
A stochastic combinatorial semi-bandit is an on-line learning problem where at each step a learning...
A stochastic combinatorial semi-bandit with a linear payoff is a sequential learning problem where a...
The contextual combinatorial semi-bandit problem with linear payoff functions is a decision-making p...
We consider combinatorial semi-bandits over a set X ⊂ {0, 1}^d where rewards a...
This paper investigates stochastic and adversarial combinatorial multi-armed b...
We consider combinatorial semi-bandits over a set of arms X ⊂ {0, 1}^d where r...
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the...
A matroid is a notion of independence in combinatorial optimization which is closely related to com...
A matroid is a notion of independence in combinatorial optimization that characterizes problems tha...
We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, w...