This paper is about index policies for minimizing (frequentist) regret in a stochastic multi-armed bandit model, inspired by a Bayesian view on the problem. Our main contribution is to prove that the Bayes-UCB algorithm, which relies on quantiles of posterior distributions, is asymptotically optimal when the reward distributions belong to a one-dimensional exponential family, for a large class of prior distributions. We also show that the Bayesian literature gives new insight into what kind of exploration rates could be used in frequentist, UCB-type algorithms. Indeed, approximations of the Bayesian optimal solution or the Finite Horizon Gittins indices provide a justification for the kl-UCB+ and kl-UCB-H+ algorithms, whose asymptotic optimality is also established.
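For concreteness, here is a minimal sketch of the Bayes-UCB index in the Bernoulli case, an illustrative special case of the one-dimensional exponential families treated in the paper. The Beta(1, 1) prior, the quantile level 1 - 1/(t log(T)^c), the tuning parameter c, and the helper names bayes_ucb_index and bayes_ucb are assumptions made for this sketch, not the paper's exact specification.

```python
# Sketch of Bayes-UCB for Bernoulli rewards: at each round, pull the
# arm whose posterior distribution has the highest quantile at level
# 1 - 1/(t * log(T)^c). Assumes a uniform Beta(1, 1) prior and c = 0.

import numpy as np
from scipy.stats import beta


def bayes_ucb_index(successes, pulls, t, horizon, c=0):
    """Posterior quantile index of one arm under a uniform Beta prior."""
    level = 1.0 - 1.0 / (t * max(np.log(horizon), 1.0) ** c)
    # Posterior after `pulls` observations is Beta(1 + s, 1 + pulls - s).
    return beta.ppf(level, 1 + successes, 1 + pulls - successes)


def bayes_ucb(means, horizon, rng, c=0):
    """Play `horizon` rounds on Bernoulli arms with the given means."""
    k = len(means)
    successes = np.zeros(k)
    pulls = np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:
            a = t - 1  # pull each arm once to initialize
        else:
            indices = [bayes_ucb_index(successes[a], pulls[a], t, horizon, c)
                       for a in range(k)]
            a = int(np.argmax(indices))  # arm with the highest quantile
        r = rng.binomial(1, means[a])
        successes[a] += r
        pulls[a] += 1
    return successes.sum()


print(bayes_ucb([0.3, 0.5, 0.6], horizon=1000, rng=np.random.default_rng(0)))
```

With c = 0 the quantile level reduces to 1 - 1/t, a choice commonly used in practice; the asymptotic analysis relies on a slowly inflated level of the form above.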
We consider a class of stochastic sequential allocation problems - restless multi-armed bandits (RMAB)...
Over the past few years, the multi-armed bandit model has become increasingly ...
We consider optimal sequential allocation in the context of the so-called stochastic multi-armed bandit model. ...
Originally motivated by default risk management applications, this paper investigates ...
In this thesis, we study strategies for sequential resource allocation, under the so-called stochastic multi-armed bandit model. ...