How can we take advantage of opportunities for experimental parallelization in exploration-exploitation tradeoffs? In many experimental scenarios, it is desirable to execute experiments simultaneously or in batches rather than performing only one at a time. Additionally, observations may be both noisy and expensive. We introduce Gaussian Process Batch Upper Confidence Bound (GP-BUCB), an upper confidence bound-based algorithm which models the reward function as a sample from a Gaussian process and which can select batches of experiments to run in parallel. We prove a general regret bound for GP-BUCB, as well as the surprising result that, for some common kernels, the asymptotic average regret can be made independent of the batch size...
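To make the batch-selection idea concrete, here is a minimal numerical sketch, not the authors' reference implementation: it assumes a unit-variance RBF kernel, and the helper names rbf_kernel, gp_posterior, and gp_bucb_batch are illustrative, not from the paper. The property it exploits, as in GP-BUCB, is that the GP posterior variance does not depend on the observed values, so within a batch the posterior mean can be frozen at the last real feedback while the variance is shrunk using "hallucinated" observations.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Unit-variance squared-exponential kernel between rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale ** 2)

def gp_posterior(X_obs, y_obs, X_cand, noise=0.1):
    """GP posterior mean and variance at the candidate points."""
    K = rbf_kernel(X_obs, X_obs) + noise ** 2 * np.eye(len(X_obs))
    K_s = rbf_kernel(X_cand, X_obs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.clip(1.0 - (v ** 2).sum(axis=0), 1e-12, None)
    return mean, var

def gp_bucb_batch(X_obs, y_obs, X_cand, batch_size, beta=4.0, noise=0.1):
    """Pick a batch of candidate indices GP-BUCB style: the posterior mean
    stays frozen at the last real feedback, while the posterior variance is
    shrunk with hallucinated observations (valid because the GP posterior
    variance does not depend on the observed y-values)."""
    mean, _ = gp_posterior(X_obs, y_obs, X_cand, noise)  # frozen for the whole batch
    X_h, y_h = X_obs.copy(), y_obs.copy()
    batch = []
    for _ in range(batch_size):
        _, var = gp_posterior(X_h, y_h, X_cand, noise)   # reflects hallucinations
        i = int(np.argmax(mean + np.sqrt(beta * var)))   # UCB selection rule
        batch.append(i)
        # Hallucinate the outcome as the frozen mean; its value is irrelevant
        # to future variances, but appending the input shrinks them.
        X_h = np.vstack([X_h, X_cand[i]])
        y_h = np.append(y_h, mean[i])
    return batch

# Tiny demo on a 1-D toy function.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(5, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(5)
cand = np.linspace(-2.0, 2.0, 200)[:, None]
print(gp_bucb_batch(X, y, cand, batch_size=4))
```

Because only the variance changes within a batch, all batch_size points can be chosen before any of their outcomes are available, which is what enables parallel execution of the experiments.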
Boltzmann exploration is a classic strategy for sequential decision-making und...
Bandit algorithms are concerned with trading off exploration against exploitation, where a number of options...
We consider a generalization of stochastic bandits where the set of arms, $\cX...
Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We fo...
Gaussian processes (GP) are one of the most successful frameworks to model uncertainty. However, GP ...
In this paper, we consider the challenge of maximizing an unknown function f for which eva...
Gaussian processes (GP) are stochastic processes used as a Bayesian approach ...
The kernel-based bandit is an extensively studied black-box optimization problem, in which the objective...
This work addresses the problem of regret minimization in non-stochastic multi...
We consider sequential decision problems under uncertainty, where we seek to optimize an unknown fun...
The sequential sampling strategies based on Gaussian processes are widely use...