Abstract. We present and study a partial-information model of online learning in which a decision maker repeatedly chooses from a finite set of actions and observes some subset of the associated losses. This naturally models situations where the losses of different actions are related, so that knowing the loss of one action provides information about the losses of other actions. Moreover, it generalizes and interpolates between the well-studied full-information setting (where all losses are revealed) and the bandit setting (where only the loss of the action chosen by the player is revealed). We provide several algorithms addressing different variants of our setting, along with tight regret bounds depending on combinatorial properties of the ...
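The feedback model described in the abstract can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the `feedback` dictionary (mapping each action to the set of actions whose losses it reveals), the helper names, and the learning rate `eta` do not come from the paper. The update shown is a generic importance-weighted exponential-weights step, not necessarily any of the authors' algorithms; it is included only to show how the full-information and bandit settings arise as two extremes of one observation structure.

```python
import math
import random


def full_information(k):
    """Hypothetical helper: every action reveals the losses of all k actions."""
    return {i: set(range(k)) for i in range(k)}


def bandit(k):
    """Hypothetical helper: every action reveals only its own loss."""
    return {i: {i} for i in range(k)}


def observe(feedback, action, losses):
    """Return the subset of losses revealed after playing `action`."""
    return {j: losses[j] for j in feedback[action]}


def exp_weights_step(weights, feedback, losses, eta, rng):
    """One round of exponential weights with importance-weighted loss estimates.

    Assumption: the loss of action j is estimated as loss_j / q_j when it is
    observed (and 0 otherwise), where q_j is the probability that the sampled
    action reveals j. This keeps the estimate unbiased.
    """
    k = len(weights)
    total = sum(weights)
    probs = [w / total for w in weights]
    action = rng.choices(range(k), weights=probs)[0]
    seen = observe(feedback, action, losses)

    new_weights = list(weights)
    for j, loss in seen.items():
        # Probability that j's loss is revealed under the current distribution.
        q = sum(probs[i] for i in range(k) if j in feedback[i])
        new_weights[j] = weights[j] * math.exp(-eta * loss / q)
    return action, seen, new_weights


if __name__ == "__main__":
    rng = random.Random(0)
    losses = [0.2, 0.5, 0.9]
    for name, graph in [("full information", full_information(3)),
                        ("bandit", bandit(3))]:
        action, seen, weights = exp_weights_step([1.0] * 3, graph, losses, 0.1, rng)
        print(name, "played", action, "observed", sorted(seen), "weights", weights)
```

With `full_information(k)` every weight is updated each round, while with `bandit(k)` only the played action's weight changes; intermediate observation structures fall between the two.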
Many well-studied online decision-making and learning models rely on the assumption that the environ...
We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under di...
We propose a new partial-observability model for online learning problems wher...
We introduce and study a partial-information model of online learning, where a decision maker repeat...
We consider an adversarial online learning setting where a decision maker can choose an action in ev...
This dissertation considers a problem of online learning and online decision making where an agent o...
We study a partial-information online-learning problem where actions are restricted to noisy...
We investigate a nonstochastic bandit setting in which the loss of an action is not immediately char...
In online learning, a player chooses actions to play and receives reward and feedback from the envir...
We study how to adapt to smoothly-varying (‘easy’) environments in well-known online learning proble...
We study the power of different types of adaptive (nonoblivious) adversaries in the setting of predi...
We study a partial-information online-learning problem where actions are restricted to noisy compar...
We consider a sequential learning problem with Gaussian payoffs and side information: after selecti...
We study the problem of online multiclass classification in a setting where the learner’s feedback i...