We consider a generalization of stochastic bandit problems where the set of arms, X, is allowed to be a generic topological space. We constrain the mean-payoff function with a dissimilarity function over X in a way that is more general than Lipschitz. We construct an arm selection policy whose regret improves upon previous results for a large class of problems. In particular, our results imply that if X is the unit hypercube in a Euclidean space and the mean-payoff function has a finite number of global maxima around which the behavior of the function is locally Hölder with a known exponent, then the expected regret is bounded up to a logarithmic factor by $\sqrt{n}$, i.e., the growth rate of the regret is indep...
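The setting in the abstract above (a continuum of arms with a Hölder mean-payoff function) can be illustrated with the simple baseline such results improve upon: UCB1 run on a fixed uniform discretization of X = [0, 1]. This is a hedged sketch only; the payoff function `f`, the grid size `K`, and the horizon `n` below are illustrative choices, not the paper's algorithm or constants.

```python
import math
import random

def f(x):
    # Illustrative smooth mean-payoff with a single global maximum at x = 0.5.
    return 1.0 - (x - 0.5) ** 2

def ucb_on_grid(n, K, seed=0):
    """UCB1 over K uniform grid midpoints of X = [0, 1]; returns regret vs. f's max (= 1)."""
    rng = random.Random(seed)
    arms = [(k + 0.5) / K for k in range(K)]  # grid midpoints
    counts = [0] * K
    sums = [0.0] * K
    total_reward = 0.0
    for t in range(1, n + 1):
        if t <= K:
            k = t - 1  # pull each arm once to initialize
        else:
            # UCB1 index: empirical mean plus exploration bonus.
            k = max(range(K), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        # Bernoulli reward with mean f(arms[k]) in [0, 1].
        r = 1.0 if rng.random() < f(arms[k]) else 0.0
        counts[k] += 1
        sums[k] += r
        total_reward += r
    return n * 1.0 - total_reward

regret = ucb_on_grid(n=5000, K=20)
```

Note the trade-off the discretization creates: a finer grid reduces the approximation error of the best grid point but spreads exploration over more arms, which is precisely the tension that dissimilarity-based policies over generic spaces are designed to resolve adaptively.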
This thesis falls within the fields of statistical learning and sequential statistics...
In contextual continuum-armed bandits, the contexts $x$ and the arms $y$ are both continuous and dra...
Bandit algorithms are concerned with trading exploration with exploitation whe...
We consider a generalization of stochastic bandits where the set of arms, $\cX...
We consider the setting of stochastic bandit problems with a continuum of arms...
In this paper we consider the problem of online stochastic optimization of a l...
We consider the problem of finding the best arm in a stochastic multi-armed ba...
This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e., those sequential selectio...
Regret minimisation in stochastic multi-armed bandits is a well-studied problem, for which several o...
We consider a stochastic bandit problem with a possibly infinite number of arms. We write p∗ for the...
We consider stochastic multi-armed bandit problems where the expected reward is a Lipschitz function...
In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlate...
We consider a bandit problem where at any time, the decision maker can add new arms to her considera...
We consider a stochastic bandit problem with infinitely many arms. In this set...