Online learning algorithms are designed to learn even when their input is generated by an adversary. The widely accepted formal definition of an online algorithm’s ability to learn is the game-theoretic notion of regret. We argue that the standard definition of regret becomes inadequate if the adversary is allowed to adapt to the online algorithm’s actions. We define the alternative notion of policy regret, which attempts to provide a more meaningful way to measure an online algorithm’s performance against adaptive adversaries. Focusing on the online bandit setting, we show that no bandit algorithm can guarantee a sublinear policy regret against an adaptive adversary with unbounded memory. On the other hand, if the adversary’s memory...
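The gap between the two regret notions in the abstract above can be illustrated with a small sketch. It assumes the usual formulation in which the round-t loss may depend on the player's whole action history: standard regret swaps only the current action and keeps the player's actual past, while policy regret replays the entire game with one fixed action. The adversary `adaptive_loss` and all function names are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
from typing import Callable, List, Sequence

# A loss maps the action history x_1..x_t to the loss suffered at round t.
Loss = Callable[[Sequence[int]], float]


def adaptive_loss(history: Sequence[int]) -> float:
    """Hypothetical memory-1 adversary: charge 1 whenever the previous action was 0."""
    if len(history) < 2:
        return 0.0
    return 1.0 if history[-2] == 0 else 0.0


def total_loss(actions: Sequence[int], loss: Loss) -> float:
    """Cumulative loss of an action sequence, evaluated on its own history."""
    return sum(loss(actions[: t + 1]) for t in range(len(actions)))


def standard_regret(actions: List[int], loss: Loss, n_actions: int) -> float:
    """Comparator changes only the current action; the past stays as played."""
    incurred = total_loss(actions, loss)
    best = min(
        sum(loss(list(actions[:t]) + [x]) for t in range(len(actions)))
        for x in range(n_actions)
    )
    return incurred - best


def policy_regret(actions: List[int], loss: Loss, n_actions: int) -> float:
    """Comparator plays the same fixed action in every round, so the adversary reacts to it."""
    incurred = total_loss(actions, loss)
    best = min(total_loss([x] * len(actions), loss) for x in range(n_actions))
    return incurred - best


if __name__ == "__main__":
    played = [0] * 10  # a player that always picks action 0
    print(standard_regret(played, adaptive_loss, n_actions=2))
    print(policy_regret(played, adaptive_loss, n_actions=2))
```

Against this toy adversary, a player who always picks action 0 has zero standard regret, because changing only the current action never changes the current loss, yet its policy regret grows linearly with the number of rounds, since always playing action 1 would have incurred no loss at all.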
We present a learning algorithm for undiscounted reinforcement learning. Our interest lies in bounds...
We study online learnability of a wide class of problems, extending the results of [25] to general n...
We consider a collaborative online learning paradigm, wherein a group of agents connected through a ...
We study the power of different types of adaptive (nonoblivious) adversaries in the setting of predi...
External regret compares the performance of an online algorithm, selecting among N actions, to the p...
We study the problem of online learning with a notion of regret defined with respect to a set of str...
We study a new class of online learning problems where each of the online algorithm’s actions is ass...
Online learning or sequential decision making is formally defined as a repeated game between an adve...
The greedy algorithm has been extensively studied in the field of combinatorial optimization for decades....
We demonstrate a modification of the algorithm of Dani et al. for the online linear optimization prob...
We study the online learning model: a widely applicable model for making repeated choices in an inte...
Much of modern learning theory has been split between two regimes: the classical offline setting, wh...