In many online decision processes, the optimizing agent is called to choose between large numbers of alternatives with many inherent similarities; in turn, these similarities imply closely correlated losses that may confound standard discrete choice models and bandit algorithms. We study this question in the context of nested bandits, a class of adversarial multi-armed bandit problems where the learner seeks to minimize their regret in the presence of a large number of distinct alternatives with a hierarchy of embedded (non-combinatorial) similarities. In this setting, optimal algorithms based on the exponential weights blueprint (like Hedge, EXP3, and their variants) may incur significant regret be...
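For context on the exponential-weights blueprint the abstract names, below is a minimal EXP3 sketch under the standard assumptions (K arms, losses in [0, 1], adversarial feedback revealed only for the played arm). The function name exp3, the loss_fn callback, and the step-size tuning are illustrative choices for this sketch, not the paper's nested-bandit algorithm.

import numpy as np

def exp3(n_arms, horizon, loss_fn, eta=None, rng=None):
    # Minimal EXP3 sketch: adversarial bandit, losses assumed in [0, 1].
    rng = np.random.default_rng() if rng is None else rng
    if eta is None:
        # A standard step-size tuning for the known-horizon, loss-based variant.
        eta = np.sqrt(2.0 * np.log(n_arms) / (horizon * n_arms))
    cum_loss = np.zeros(n_arms)  # importance-weighted cumulative loss estimates
    total = 0.0
    for t in range(horizon):
        # Exponential-weights distribution; subtracting the min is shift-invariant
        # and avoids numerical underflow.
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        p = w / w.sum()
        arm = int(rng.choice(n_arms, p=p))
        loss = loss_fn(t, arm)           # only the played arm's loss is observed
        total += loss
        # Unbiased importance-weighted estimate of the full loss vector.
        cum_loss[arm] += loss / p[arm]
    return total

For instance, exp3(n_arms=10, horizon=1000, loss_fn=lambda t, a: float(a != 3)) runs against a fixed best arm and attains the familiar O(sqrt(T K log K)) regret; the nested-bandit setting of the paper replaces this flat arm set with a hierarchy of similar alternatives.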
We study the power of different types of adaptive (nonoblivious) adversaries in the setting of predi...
Online learning algorithms are designed to learn even when their input is generated by an adversary....
We study the problem of decision-making under uncertainty in the bandit setting. This thesis goes be...
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
This paper investigates stochastic and adversarial combinatorial multi-armed b...
We study a partial-information online-learning problem where actions are restricted to noisy compar...
In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlate...
We study a new class of online learning problems where each of the online algorithm’s actions is ass...
The nonstochastic multi-armed bandit problem, first studied by Auer, Cesa-Bianchi, Freund, a...
This thesis investigates sequential decision making tasks that fall in the framework of reinforcemen...
In a bandit problem there is a set of arms, each of which when played by an agent yields some reward...
We consider running multiple instances of multi-armed bandit (MAB) problems in parallel. A main moti...
Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness i...