We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound of Li et al. (2017) via recent developments in the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on $1/\kappa$, where $\kappa$ is the minimal variance over all arms' reward distributions. In general, $1/\kappa$ scales exponentially with the norm of the unknown linear parameter $\theta^*$. Instead of relying on this worst-case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration...
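To make the role of $\kappa$ concrete, here is a minimal worked sketch of the quantities involved, under the common normalization (assumed here, not stated in the abstract) that arms satisfy $\|x\| \le 1$ and $\|\theta^*\| \le S$:

$$
\mu(z) = \frac{1}{1+e^{-z}}, \qquad
\dot\mu(z) = \mu(z)\bigl(1-\mu(z)\bigr), \qquad
\kappa := \min_{x \in \mathcal{X}} \dot\mu(x^\top \theta^*).
$$

For Bernoulli rewards with mean $\mu(x^\top\theta^*)$, the variance of arm $x$ is exactly $\dot\mu(x^\top\theta^*)$, matching the definition of $\kappa$ as a minimal variance. Since $\dot\mu$ is symmetric and decreasing away from $0$, and $|x^\top\theta^*| \le S$ under the assumed normalization, the minimum is attained at the boundary: $\kappa \ge \dot\mu(S) = e^{-S}/(1+e^{-S})^2 \ge e^{-S}/4$. Conversely, if some arm attains $|x^\top\theta^*| = S$, then $1/\kappa = 1/\dot\mu(S) \ge e^{S}$, which is the exponential scaling referred to above. A variance-dependent bound for arm $x$ instead pays $1/\dot\mu(x^\top\theta^*)$, which can be far smaller than $1/\kappa$ for arms whose rewards are not near-deterministic.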
We address multi-armed bandits (MAB) where the objective is to maximize the cumulative reward under ...
We investigate an active pure-exploration setting that incl...
In online learning problems, exploiting low variance plays an important role in obtaining tight perf...
We study two randomized algorithms for generalized linear bandits, GLM-TSL and GLM-FPL. GLM-TSL samp...
Logistic Bandits have recently attracted substanti...
In this dissertation we present recent contributions to the problem of optimization under bandit fee...
In sparse linear bandits, a learning agent sequentially selects an action and receives reward feedbac...
We propose a new online algorithm for minimizing the cumulative regret in stochastic linear bandits....
We consider bandit problems involving a large (possibly infinite) collection of arms, in which the e...
The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems i...
We study two model selection settings in stochastic linear bandits (LB). In the first setting, which...
This paper is devoted to regret lower bounds in the classical model of stochastic multi-armed bandit...
We consider the problem of online learning in misspecified linear stochastic multi-armed bandit prob...
We improve the theoretical analysis and empirical performance of algorithms for the stochastic multi...