The bandit classification problem considers learning the labels of a time-indexed data stream under a mere "hit-or-miss" binary guidance. Adapting the OVA ("one-versus-all") hinge-loss setup, we develop a sparse and lightweight solution to this problem. The resulting sequential norm-minimal update solves the classification problem in finite time in the separable case, provided enough redundancy is present in the data. An O(√T) regret is moreover expected in the non-separable case. The algorithm shows its effectiveness on both large-scale text-mining and machine-learning datasets, with (i) a favorable comparison with the more demanding confidence-based second-order bandit setups on large-scale datasets and (ii) good sparsity and efficacy whe...
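The abstract names the ingredients (OVA hinge loss, hit-or-miss bandit feedback, a sequential norm-minimal update) but not the exact algorithm, so the following is only a minimal sketch under stated assumptions: an exploration step on the played label combined with a passive-aggressive-style (norm-minimal) OVA hinge-loss correction. The class name BanditOVAClassifier, the exploration parameter, and all implementation details are illustrative and should not be read as the paper's method.

```python
import numpy as np

class BanditOVAClassifier:
    """Illustrative sketch only: multiclass classification from
    hit-or-miss bandit feedback, using one weight vector per class
    (OVA) and a norm-minimal (passive-aggressive-style) hinge-loss
    correction on the label that was played."""

    def __init__(self, n_features, n_classes, exploration=0.05, seed=0):
        self.W = np.zeros((n_classes, n_features))  # one row per class
        self.gamma = exploration                    # exploration rate (assumed)
        self.rng = np.random.default_rng(seed)

    def predict(self, x):
        """Play the greedy label most of the time, a uniformly random one
        with probability gamma (needed to gather feedback on every class)."""
        if self.rng.random() < self.gamma:
            return int(self.rng.integers(self.W.shape[0]))
        return int(np.argmax(self.W @ x))

    def update(self, x, played, hit):
        """hit is the binary hit-or-miss signal: True iff `played` was correct."""
        margin = float(self.W[played] @ x)
        # OVA hinge loss on the played class only: we want margin >= +1
        # on a hit and margin <= -1 on a miss.
        loss = max(0.0, 1.0 - margin) if hit else max(0.0, 1.0 + margin)
        if loss > 0.0:
            # Norm-minimal step: the smallest change (in Euclidean norm)
            # to W[played] that zeroes this hinge loss.
            tau = loss / (float(x @ x) + 1e-12)
            self.W[played] += (tau if hit else -tau) * x
```

A typical round would call predict(x) to obtain the played label, receive the binary hit/miss signal from the environment, and then call update(x, played, hit); only the played class's weight vector is touched, which keeps each update cheap and local.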
This paper considers no-regret learning for repeated continuous-kernel games with lossy bandit feedb...
We consider the problem of online combinatorial optimization under semi-bandit...
This document is the full version of an extended abstract published in the proceedings of COLT 2017....
This paper introduces the Banditron, a variant of the Perceptron [Rosenblatt, 1958], for the multicl... (a sketch of the Banditron update follows these entries).
We consider online learning problems under a partial observability model cap...
In this paper we consider the problem of online stochastic optimization of a l...
In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlate...
We investigate a nonstochastic bandit setting in which the loss of an action i...
In the fixed budget thresholding bandit problem, an algorithm sequentially all...
We introduce and study a partial-information model of online learning, where a decision maker repeat...
In a bandit problem there is a set of arms, each of which when played by an agent yields some reward...
We consider a novel variant of the contextual bandit problem (i.e., the multi-armed bandit with side...
We consider online learning in finite stochastic Markovian environments where ...
We consider the classic online learning and stochastic multi-armed bandit (MAB) problems, when at ea...
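Because the Banditron entry above is cut off, here is a minimal sketch of one round of the standard Banditron update for reference: the greedy multiclass prediction is mixed with uniform exploration, and the observed hit-or-miss signal drives an importance-weighted, unbiased estimate of the multiclass Perceptron update. The function name and the exploration parameter gamma are naming choices made here, not taken from any of the listed papers.

```python
import numpy as np

def banditron_round(W, x, y_true, gamma, rng):
    """One round of the standard Banditron update (sketch).
    W: (k, d) weight matrix; x: (d,) features; y_true: correct label;
    gamma: exploration rate in (0, 1); rng: numpy Generator."""
    k = W.shape[0]
    y_hat = int(np.argmax(W @ x))           # greedy multiclass prediction
    probs = np.full(k, gamma / k)           # spread gamma uniformly over labels
    probs[y_hat] += 1.0 - gamma             # remaining mass on the greedy label
    y_played = int(rng.choice(k, p=probs))  # label actually announced
    hit = (y_played == y_true)              # the only feedback available
    # Importance-weighted update whose expectation equals the
    # full-information multiclass Perceptron update:
    W = W.copy()
    W[y_hat] -= x                           # always demote the greedy row
    if hit:
        W[y_played] += x / probs[y_played]  # promote the played row on a hit
    return W, y_played, hit

# Example usage (hypothetical data):
# rng = np.random.default_rng(0)
# W = np.zeros((5, 20))
# W, y_played, hit = banditron_round(W, rng.normal(size=20), y_true=2,
#                                    gamma=0.1, rng=rng)
```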