In this thesis, we study strategies for sequential resource allocation. The statistical model adopted in this setting is the stochastic multi-armed bandit. In this model, when an agent pulls an arm of the bandit, it receives as a reward a realization of a probability distribution associated with that arm. We consider two different bandit problems: maximizing the sum of rewards, and best-arm identification (where the agent seeks to identify the arm or arms with the highest mean reward, without incurring any loss when it pulls a "bad" arm). For both objectives, we aim to propose arm-pulling strategies, also called bandit algorithms...
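As a rough illustration of the model described in this abstract, the sketch below simulates a stochastic bandit in which each pull returns one realization of the arm's reward distribution. Everything in it is an assumption made for illustration and not taken from the thesis: the Bernoulli rewards, the names `BernoulliBandit` and `run_uniform`, and the round-robin sampling rule, which is only a naive baseline and not one of the strategies the thesis proposes. The function reports both quantities the abstract distinguishes: the sum of rewards, and a guess for the best arm.

```python
import random


class BernoulliBandit:
    """Hypothetical bandit: arm k pays 1 with probability means[k], else 0."""

    def __init__(self, means, seed=0):
        self.means = list(means)
        self.rng = random.Random(seed)

    def pull(self, arm):
        # The reward is one realization of the distribution attached to the arm.
        return 1 if self.rng.random() < self.means[arm] else 0


def run_uniform(bandit, horizon):
    """Pull arms in round-robin order (a naive baseline, assumed here).

    Returns the sum of rewards and the arm with the best empirical mean.
    """
    k = len(bandit.means)
    counts = [0] * k
    sums = [0] * k
    for t in range(horizon):
        arm = t % k
        reward = bandit.pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    total = sum(sums)
    averages = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    guess = max(range(k), key=lambda a: averages[a])
    return total, guess


bandit = BernoulliBandit([0.2, 0.5, 0.8], seed=1)
total, guess = run_uniform(bandit, horizon=3000)
print("sum of rewards:", total, "empirical best arm:", guess)
```

Under the first objective, every pull of a suboptimal arm lowers `total`, so uniform sampling is wasteful. Under the second objective only the final `guess` matters, which is why the two problems can call for different sampling rules.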
The main topics addressed in this thesis lie in the general domain of sequential learning, and in par...
This thesis, at the crossroads of the fields of artificial intelligence, sequential statistics...
In this thesis, we study strategies for sequential resource allocation, under the so-called stochast...
This paper is about index policies for minimizing (frequentist) regret in a st...
This document presents in a unified way different results about the optimal solution of several mult...
This thesis lies within the fields of statistical learning and sequential statistics...
Over the past few years, the multi-armed bandit model has become increasingly ...