We consider multi-armed bandit problems where the number of arms is larger than the possible number of experiments. We make a stochastic assumption on the mean reward of a newly selected arm, which characterizes its probability of being a near-optimal arm. Our assumption is weaker than in previous works. We describe algorithms based on upper confidence bounds applied to a restricted set of randomly selected arms, and provide upper bounds on the resulting expected regret. We also derive a lower bound which matches (up to a logarithmic factor) the upper bound in some cases.
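The strategy described in this abstract, upper confidence bounds run on a restricted set of randomly selected arms, can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the subset size k, the Bernoulli reward model, the uniform prior on arm means, and all function names are assumptions made for the example; the paper's UCB variants and its choice of the number of selected arms follow from its regret analysis.

```python
import math
import random

def ucb_on_random_subset(draw_mean, n, k, rng=random):
    """Sketch: sample k arms from an infinite reservoir, then run UCB1 on them.

    draw_mean: callable returning the mean reward of a freshly sampled arm
    (this models the stochastic assumption on new arms). Rewards are
    Bernoulli with the arm's mean. Returns the total reward over n rounds.
    """
    means = [draw_mean() for _ in range(k)]  # the restricted random arm set
    counts = [0] * k
    sums = [0.0] * k
    total = 0.0
    for t in range(1, n + 1):
        if t <= k:
            # Initialization: play each selected arm once.
            i = t - 1
        else:
            # Play the arm maximizing the UCB1 index: empirical mean + bonus.
            i = max(range(k),
                    key=lambda a: sums[a] / counts[a]
                    + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[i] else 0.0
        counts[i] += 1
        sums[i] += reward
        total += reward
    return total

# Illustrative run: arm means drawn uniformly on [0, 1], so a new arm is
# near-optimal (mean > 1 - eps) with probability about eps.
rng = random.Random(0)
total = ucb_on_random_subset(rng.random, n=2000, k=20, rng=rng)
```

The key trade-off the example makes visible: a larger k raises the best mean in the selected set but spends more rounds exploring, which is why the paper's analysis ties the number of selected arms to the horizon and to the probability of drawing a near-optimal arm.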
We consider a stochastic bandit problem with infinitely many arms. In this setting, the learner has...
Regret minimisation in stochastic multi-armed bandits is a well-studied problem, for which several o...
We consider a stochastic bandit problem with a possibly infinite number of arms. We write p∗ for the...
We study the infinitely many-armed bandit problem with budget constraints, where the number of arms ...