In many online decision processes, the optimizing agent is called to choose between large numbers of alternatives with many inherent similarities; in turn, these similarities imply closely correlated losses that may confound standard discrete choice models and bandit algorithms. We study this question in the context of nested bandits, a class of adversarial multi-armed bandit problems where the learner seeks to minimize their regret in the presence of a large number of distinct alternatives with a hierarchy of embedded (non-combinatorial) similarities. In this setting, optimal algorithms based on the exponential weights blueprint (like Hedge, EXP3, and their variants) may incur significant regret be...
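For context on the exponential-weights blueprint the abstract names, below is a minimal EXP3 sketch under the standard assumptions (K arms, losses in [0, 1], adversarial feedback revealed only for the played arm). The function name exp3, the loss_fn callback, and the step-size tuning are illustrative choices for this sketch, not the paper's nested-bandit algorithm.

import numpy as np

def exp3(n_arms, horizon, loss_fn, eta=None, rng=None):
    # Minimal EXP3 sketch: adversarial bandit, losses assumed in [0, 1].
    rng = np.random.default_rng() if rng is None else rng
    if eta is None:
        # A standard step-size tuning for the known-horizon, loss-based variant.
        eta = np.sqrt(2.0 * np.log(n_arms) / (horizon * n_arms))
    cum_loss = np.zeros(n_arms)  # importance-weighted cumulative loss estimates
    total = 0.0
    for t in range(horizon):
        # Exponential-weights distribution; subtracting the min is shift-invariant
        # and avoids numerical underflow.
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        p = w / w.sum()
        arm = int(rng.choice(n_arms, p=p))
        loss = loss_fn(t, arm)           # only the played arm's loss is observed
        total += loss
        # Unbiased importance-weighted estimate of the full loss vector.
        cum_loss[arm] += loss / p[arm]
    return total

For instance, exp3(n_arms=10, horizon=1000, loss_fn=lambda t, a: float(a != 3)) runs against a fixed best arm and attains the familiar O(sqrt(T K log K)) regret; the nested-bandit setting of the paper replaces this flat arm set with a hierarchy of similar alternatives.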
We study the power of different types of adaptive (nonoblivious) adversaries in the setting of predi...
Online learning algorithms are designed to learn even when their input is generated by an adversary....
We study the problem of decision-making under uncertainty in the bandit setting. This thesis goes be...
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
This paper investigates stochastic and adversarial combinatorial multi-armed b...
We study a partial-information online-learning problem where actions are restricted to noisy compar...
In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlate...
We study a new class of online learning problems where each of the online algorithm’s actions is ass...
The nonstochastic multi-armed bandit problem, first studied by Auer, Cesa-Bianchi, Freund, a...
This thesis investigates sequential decision making tasks that fall in the framework of reinforcemen...
In a bandit problem there is a set of arms, each of which when played by an agent yields some reward...
We consider running multiple instances of multi-armed bandit (MAB) problems in parallel. A main moti...
Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness i...