We consider stochastic bandit problems with $K$ arms, each associated with a bounded distribution supported on a range $[m,M]$. We do not assume that the range $[m,M]$ is known and show that there is a cost for learning it. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which prevents one from simultaneously achieving the typical $\ln T$ and $\sqrt{T}$ bounds. For instance, a $\sqrt{T}$ distribution-free regret bound may only be achieved if the distribution-dependent regret bounds are at least of order $\sqrt{T}$. We exhibit a strategy achieving the rates for regret indicated by this new trade-off.
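To make the known-range baseline concrete, the sketch below runs classical UCB1 on Bernoulli arms and tracks pseudo-regret. UCB1's confidence width $\sqrt{2\ln t / n_i}$ is calibrated for rewards in $[0,1]$, i.e. it presumes the range is known in advance; this is exactly the assumption dropped above. This is a minimal illustrative simulation of the standard setting, not the strategy exhibited in the paper, and all function and variable names are ours.

```python
import math
import random


def ucb1_regret(means, horizon, seed=0):
    """Simulate UCB1 on Bernoulli arms with the given means.

    Returns the cumulative pseudo-regret sum_t (mu* - mu_{I_t}).
    The exploration bonus sqrt(2 ln t / n_i) is tuned for rewards
    in [0, 1], i.e. a *known* range; with an unknown range [m, M]
    this width is miscalibrated, which is the issue studied above.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k       # number of pulls per arm
    sums = [0.0] * k       # cumulative reward per arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1    # initialisation: pull each arm once
        else:
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret
```

With a gap of $\Delta = 0.1$ between the arms, the pseudo-regret is trivially at most $\Delta T$, and UCB1 keeps it far below that; the paper's point is that no such $\ln T$-type guarantee survives once $[m,M]$ must itself be learned while also keeping the $\sqrt{T}$ distribution-free rate.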