Pareto upper confidence bounds algorithms: an empirical study

Drugan, M.M.
Nowé, A.
Manderick, B.

Publication date

January 2014

DOI

10.1109/adprl.2014.7010620

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Abstract

Many real-world stochastic environments are inherently multi-objective, with conflicting objectives. Multi-objective multi-armed bandits (MOMABs) extend the classical, i.e. single-objective, multi-armed bandits to reward vectors, and multi-objective optimisation techniques are often required to design mechanisms with an efficient exploration/exploitation trade-off. In this paper, we propose the improved Pareto Upper Confidence Bound (iPUCB) algorithm, which straightforwardly extends the single-objective improved UCB algorithm to reward vectors by deleting the suboptimal arms. The goal of iPUCB is to identify the set of best arms, or the Pareto front, in a fixed budget of arm ...
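
To make the notion of a Pareto front over reward vectors concrete, the sketch below shows the standard Pareto dominance relation under maximisation and uses it to pick out the non-dominated arms. It is an illustration only and is not the iPUCB algorithm from the paper: the function names `dominates` and `pareto_front` and the mean reward vectors are hypothetical, and the sketch assumes the true mean vectors are known, whereas a bandit algorithm has to estimate them from samples.

```python
def dominates(u, v):
    """Return True if reward vector u Pareto-dominates v (maximisation).

    u dominates v when u is at least as good in every objective
    and strictly better in at least one.
    """
    at_least_as_good = all(a >= b for a, b in zip(u, v))
    strictly_better = any(a > b for a, b in zip(u, v))
    return at_least_as_good and strictly_better


def pareto_front(means):
    """Return the indices of arms not dominated by any other arm."""
    return [
        i
        for i, u in enumerate(means)
        if not any(dominates(v, u) for j, v in enumerate(means) if j != i)
    ]


# Hypothetical mean reward vectors: four arms, two objectives (assumed values).
means = [(0.8, 0.2), (0.5, 0.5), (0.2, 0.8), (0.4, 0.4)]
front = pareto_front(means)
print(front)
```

With these assumed values, the last arm is dominated by the second one, while the first three arms are mutually incomparable and together form the Pareto front.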
