In this work, we address the combinatorial optimization problem in the stochastic bandit setting with bandit feedback. We propose to use the seminal Thompson Sampling algorithm under an assumption on the reward expectations. More specifically, we tackle the online feature selection problem, where our results show that Thompson Sampling performs well. Additionally, we discuss the challenges associated with online feature selection and highlight relevant directions for future work.
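To make the setting concrete, the following is a minimal sketch of Beta-Bernoulli Thompson Sampling applied to subset selection under bandit feedback. It is not the method of this work: it takes the simplest possible baseline, in which every size-k feature subset is treated as a single arm and only one binary reward is observed per round. The function names, the Beta(1, 1) prior, the binary reward, and the toy environment are all assumptions made for illustration.

```python
import itertools
import random


def thompson_subset_selection(n_features, k, reward_fn, horizon, seed=0):
    """Beta-Bernoulli Thompson Sampling with one arm per size-k feature subset."""
    rng = random.Random(seed)
    subsets = list(itertools.combinations(range(n_features), k))
    # Assumed Beta(1, 1) prior on the expected reward of each subset.
    alpha = {s: 1.0 for s in subsets}
    beta = {s: 1.0 for s in subsets}
    for _ in range(horizon):
        # Draw one posterior sample per subset and play the subset with the largest draw.
        chosen = max(subsets, key=lambda s: rng.betavariate(alpha[s], beta[s]))
        # Bandit feedback: a single binary reward for the whole subset,
        # with no per-feature information.
        reward = reward_fn(chosen, rng)
        alpha[chosen] += reward
        beta[chosen] += 1 - reward
    # Report the subset with the highest posterior mean.
    return max(subsets, key=lambda s: alpha[s] / (alpha[s] + beta[s]))


def toy_reward(subset, rng):
    # Hypothetical environment: features 0 and 1 are the informative ones.
    p = 0.1 + 0.4 * len({0, 1} & set(subset))
    return 1 if rng.random() < p else 0


best = thompson_subset_selection(n_features=4, k=2, reward_fn=toy_reward, horizon=2000)
print(best)
```

Enumerating subsets only scales to small problems, since the number of arms grows combinatorially with the number of features. Approaches that exploit structure in the reward expectations avoid this enumeration.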
Presented at the Thirty-eighth International Conference on Machine Learning (ICML 2021)International...
We present a novel extension of Thompson Sampling for stochastic sequential decision problems with g...
We present a novel extension of Thompson Sampling for stochastic sequential decision problems with g...
We present a simple set of algorithms based on Thompson Sampling for stochastic bandit problems with...
We consider stochastic multi-armed bandit problems with complex actions over a set of basic arms, w...
We investigate stochastic combinatorial multi-armed bandit with semi-bandit fe...
We investigate stochastic combinatorial multi-armed bandit with semi-bandit fe...
The aim of the research presented in this dissertation is to construct a model for personalised item...
The multi-armed bandit problem is a popular model for studying exploration/exploitation trade-off in...
The aim of the research presented in this dissertation is to construct a model for personalised item...
We address the problem of online sequential decision making, i.e., balancing the trade-off between e...
The multi-armed bandit (MAB) problem provides a convenient abstraction for many online decision prob...
Thompson Sampling has recently been shown to achieve the lower bound on regret in the Bernoulli Mult...
The Thompson Sampling exhibits excellent results in practice and it has been s...
Thompson Sampling (TS) has attracted a lot of interest due to its good empirical performance, in partic...