This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient, as well as four different notions of regret: pseudo-regret, expected regret, high probability regret and tracking the best expert regret. We introduce a new forecaster, INF (Implicitly Normalized Forecaster), based on an arbitrary function ψ, for which we propose a unified analysis of its pseudo-regret in the four games we consider. In particular, for ψ(x)=exp(η x) + γ/K, INF reduces to the classical exponentially weighted average forecaster and our analysis of the pseudo-regret recovers known results, while for the expected regret we slightly tighten the bounds. On the other hand with ψ(x)=...
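To make the reduction for ψ(x)=exp(η x) + γ/K concrete, here is a minimal Python sketch. It rests on an assumption the abstract does not spell out: that "implicitly normalized" means the probability of arm i is ψ(V_i − C), where V_i is a cumulative gain estimate and C is the constant that makes the probabilities sum to one. The names `inf_probabilities`, `exp_psi` and `closed_form` are hypothetical, and the bisection search is an illustrative choice rather than the authors' procedure. Under that assumption the result should coincide with exponential weights mixed with uniform exploration.

```python
import math


def inf_probabilities(gains, psi, iters=200):
    """Return p_i = psi(gains[i] - C), with C chosen so that sum_i p_i = 1.

    Assumption (not stated in the abstract): psi is increasing, so the sum
    is decreasing in C and the normalizing constant can be found by bisection.
    """
    def total(c):
        return sum(psi(v - c) for v in gains)

    # Bracket the root: total(lo) >= 1 >= total(hi).
    lo = hi = max(gains)
    step = 1.0
    while total(hi) > 1.0:
        hi += step
        step *= 2.0
    step = 1.0
    while total(lo) < 1.0:
        lo -= step
        step *= 2.0

    # Plain bisection on the normalizing constant C.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    return [psi(v - c) for v in gains]


def exp_psi(eta, gamma, k):
    """psi(x) = exp(eta * x) + gamma / k, assuming eta > 0 and 0 <= gamma < 1."""
    return lambda x: math.exp(eta * x) + gamma / k


def closed_form(gains, eta, gamma):
    """Exponential weights mixed with uniform exploration, for comparison."""
    k = len(gains)
    m = max(gains)  # subtract the max for numerical stability
    weights = [math.exp(eta * (v - m)) for v in gains]
    z = sum(weights)
    return [(1.0 - gamma) * w / z + gamma / k for w in weights]


gains = [1.0, 2.0, 0.5]  # made-up cumulative gain estimates
eta, gamma = 0.5, 0.1    # made-up parameters
p_inf = inf_probabilities(gains, exp_psi(eta, gamma, len(gains)))
p_exp = closed_form(gains, eta, gamma)
```

Comparing `p_inf` with `p_exp` entry by entry gives a quick numerical check of the stated reduction. Other choices of ψ can be passed to `inf_probabilities` unchanged, as long as they are increasing.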
Cascading bandits is a natural and popular model that frames the task of learning to rank from Berno...
This paper is devoted to regret lower bounds in the classical model of stochastic multi-armed bandit...
We consider the framework of stochastic multi-armed bandit problems and study the possibilit...
We address the online linear optimization problem when the actions of ...
We address the online linear optimization problem when the actions of the forecaster are represented...
We address online linear optimization problems when the possible actions of the decision maker are r...
This work studies external regret in sequential prediction games with both positive and negative pay...
In a partial monitoring game, the learner repeatedly chooses an action, the environment responds wit...
Classical regret minimization in a bandit framework involves a number of probability distributions ...
Partial monitoring is a rich framework for sequential decision making under uncertainty that general...
We consider the finite-horizon multi-armed bandit problem under the standard stochastic assumption o...
Algorithms based on upper-confidence bounds for balancing exploration and expl...
We consider the framework of stochastic multi-armed bandit problems and study the possibilities and ...
We consider the framework of stochastic multi-armed bandit problems and study ...