This document has been accepted for publication at AI&Statistics 2011. I need to cite this work, so I refer to this version while awaiting publication of the camera-ready version (next month). We consider multi-armed bandit games with possibly adaptive opponents. We introduce models Theta of constraints based on equivalence classes on the common history (information shared by the player and the opponent) which define two learning scenarios: (1) The opponent is constrained, i.e. he provides rewards that are stochastic functions of equivalence classes defined by some model theta*\in Theta. The regret is measured with respect to (w.r.t.) the best history-dependent strategy. (2) The opponent is arbitrary and we measure the regret w.r.t. the best strat...
This paper is devoted to regret lower bounds in the classical model of stochastic multi-armed bandit...
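The classical lower bound this abstract refers to is presumably the asymptotic bound of Lai and Robbins (1985); a sketch of its standard statement follows, where the symbols $R_T$, $\Delta_a$, and $\nu_a$ are not taken from the abstract itself but are the usual notation. For any uniformly efficient policy on a stochastic bandit with arm distributions $\nu_a$:

$$
\liminf_{T \to \infty} \frac{\mathbb{E}[R_T]}{\log T} \;\geq\; \sum_{a \,:\, \Delta_a > 0} \frac{\Delta_a}{\mathrm{KL}(\nu_a, \nu^{*})},
$$

where $\Delta_a$ is the gap between the mean of the best arm and that of arm $a$, $\nu^{*}$ is the distribution of the best arm, and $\mathrm{KL}$ denotes the Kullback-Leibler divergence.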
The multi-armed bandit is a framework allowing the study of the trade-off between exploration and ex...
This paper investigates stochastic and adversarial combinatorial multi-armed b...
The main topics addressed in this thesis lie in the general domain of sequential learning, and in par...
The nonstochastic multi-armed bandit problem, first studied by Auer, Cesa-Bianchi, Freund, a...
This work addresses the problem of regret minimization in non-stochastic multi...
We consider a stochastic bandit problem with infinitely many arms. In this set...
This thesis studies the following topics in Machine Learning: Bandit theory, Statistical learning an...
In this thesis, we study strategies for sequential resource allocation, under the so-called stochast...
We consider a situation where an agent has $T$ resources to be allocated to a larger number $N$ of ...
Over the past few years, the multi-armed bandit model has become increasingly ...
This manuscript deals with the estimation of the optimal rule and its mean reward in a simple ban...
We consider a bandit problem where at any time, the decision maker can add new arms to her considera...
In the X-armed bandit problem, an agent sequentially interacts with an environment which yields a reward bas...
A Multi-Armed Bandit (MAB) is a learning problem where an agent sequentially chooses an action amon...
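The sequential choice problem described in these abstracts can be illustrated with a minimal simulation. The sketch below runs the standard UCB1 index policy on Bernoulli arms; it is an illustrative toy, not an implementation from any of the papers listed above, and the arm means and horizon are arbitrary choices.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Toy UCB1 on Bernoulli arms; returns the pull count of each arm."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k    # number of pulls per arm
    sums = [0.0] * k    # cumulative reward per arm
    for t in range(horizon):
        if t < k:
            arm = t  # initialization: play each arm once
        else:
            # pick the arm maximizing empirical mean + exploration bonus
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t + 1) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

After 2000 rounds the pull counts concentrate on the arm with the highest mean, which is the exploration-exploitation trade-off these abstracts study: the bonus term forces occasional sampling of apparently worse arms, while the empirical mean steers play toward the best one.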