Bandit-based methods for tree search have recently gained popularity when applied to huge trees, e.g. in the game of Go [6]. Their efficient exploration of the tree makes it possible to return a good value rapidly, and to improve precision if more time is provided. The UCT algorithm [8], a tree search method based on Upper Confidence Bounds (UCB) [2], is believed to adapt locally to the effective smoothness of the tree. However, we show that UCT is “over-optimistic” in some sense, leading to a worst-case regret that may be very poor. We propose alternative bandit algorithms for tree search. First, a modification of UCT using a confidence sequence that scales exponentially in the horizon depth is analyzed. We then consider Flat-UCB performed on th...
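As a rough illustration of the UCB rule that UCT applies at each node, the following is a minimal sketch of the standard UCB1 index (empirical mean plus an exploration bonus). The function names and the `(sum_of_rewards, pulls)` layout are hypothetical and not taken from the paper, and the constant 2 in the bonus is the textbook UCB1 choice; UCT variants may use a different constant.

```python
import math


def ucb1_index(mean_reward, pulls, total_pulls):
    """Standard UCB1 index: empirical mean + sqrt(2 * ln(t) / n)."""
    if pulls == 0:
        # Assumption for this sketch: unvisited children are tried first.
        return math.inf
    return mean_reward + math.sqrt(2.0 * math.log(total_pulls) / pulls)


def select_child(stats):
    """Pick the child with the largest UCB1 index.

    stats is a list of (sum_of_rewards, pulls) tuples, one per child
    (a hypothetical layout chosen for this sketch).
    """
    total = sum(pulls for _, pulls in stats)

    def index(i):
        reward_sum, pulls = stats[i]
        mean = reward_sum / pulls if pulls else 0.0
        return ucb1_index(mean, pulls, total)

    # max() returns the first maximal element, so ties go to the lowest index.
    return max(range(len(stats)), key=index)
```

With equal visit counts the child with the higher empirical mean is chosen, while a rarely visited child can be preferred over a well-explored one because its bonus term is larger. This optimism is the behaviour the abstract examines.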
We present a new exploration term, more efficient than classical UCT-like ex...
We consider a search problem on trees in which an agent starts at the root of ...
In black-box optimization problems, we aim to maximize an unknown objective fu...
The application of multi-armed bandit (MAB) algorithms was a critical step in the developm...
Recent advances in bandit tools and techniques for sequential learning are steadily enabling new ap...
The Upper Confidence bounds for Trees (UCT) algorithm has in recent years captured the attention of ...
Upper Confidence Trees are a very efficient tool for solving Markov Decision P...
130 pages. This work covers several aspects of the optimism in the face of uncertainty principle appli...
Upper Confidence bounds applied to Trees (UCT), a bandit-based Monte-Carlo sampling algorithm for pl...