International audience. In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions. It then predicts whether the mean of each distribution is above or below a given threshold. We introduce a large family of algorithms (containing most existing relevant ones), inspired by the Frank-Wolfe algorithm, and provide a thorough yet generic analysis of their performance. This allows us to construct new explicit algorithms, for a broad class of problems, whose losses are within a small constant factor of those of the non-adaptive oracle. Interestingly, we observe that adaptive methods empirically greatly outperform non-adaptive oracles, an uncommon behavior in ...
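To make the problem setting concrete, here is a minimal sketch of a fixed budget thresholding run. It is not one of the Frank-Wolfe-inspired algorithms from the abstract: it uses a plain round-robin (non-adaptive, uniform) allocation as a stand-in baseline. The function name `uniform_thresholding`, the unit-variance Gaussian rewards, and the `seed` parameter are all assumptions made for illustration.

```python
import random


def uniform_thresholding(means, threshold, budget, seed=0):
    """Hypothetical baseline: spread the budget evenly, then classify each arm."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(budget):
        arm = t % k  # round-robin allocation; ignores past observations
        # Assumption: rewards are Gaussian with unit variance around the true mean.
        sums[arm] += rng.gauss(means[arm], 1.0)
        counts[arm] += 1
    # Empirical mean per arm (0.0 if the budget was too small to sample it).
    estimates = [s / n if n else 0.0 for s, n in zip(sums, counts)]
    # Predict, for each arm, whether its mean lies above the threshold.
    predictions = [estimate >= threshold for estimate in estimates]
    return predictions, counts


predictions, counts = uniform_thresholding([0.0, 5.0], threshold=2.5, budget=200)
print(predictions, counts)
```

An adaptive method would replace the `arm = t % k` line with a rule that depends on the samples seen so far, for instance spending more of the budget on arms whose empirical mean is close to the threshold.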
In a bandit problem there is a set of arms, each of which when played by an agent yields some reward...
International audience. We consider the problem of finding the best arm in a stochastic multi-armed ba...
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
This paper introduces the Banditron, a variant of the Perceptron [Rosenblatt, 1958], for the multicl...
The bandit classification problem considers learning the labels of a time-indexed data stream under ...
We consider a bandit problem where at any time, the decision maker can add new arms to her considera...
The increasing size of available data has led machine learning specialists to consider more complex ...
We demonstrate a modification of the algorithm of Dani et al for the online linear optimization prob...
International audience. Elimination algorithms for bandit identification, which prune the plausible co...
This thesis studies the following topics in Machine Learning: Bandit theory, Statistical learning an...
We consider the problem of online learning in misspecified linear stochastic multi-armed bandit prob...
International audience. Over the past few years, the multi-armed bandit model has become increasingly ...
We study a variant of the thresholding bandit problem (TBP) in the context of outlier detection, whe...
This document presents in a unified way different results about the optimal solution of several mult...