Abstract—The multi-armed bandit problem and one of its most interesting extensions, the restless bandits problem, are frequently encountered in various stochastic control problems. We present a linear programming relaxation for the restless bandits problem with discounted rewards, where only one project can be activated at each period but with additional costs penalizing switching between projects. The relaxation can be efficiently computed and provides a bound on the achievable performance. We describe several heuristic policies; in particular, we show that a policy adapted from the primal-dual heuristic of Bertsimas and Niño-Mora [1] for the classical restless bandits problem is in fact equivalent to a one-step lookahead policy; thus, th...
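The first-order LP relaxation mentioned in this abstract can be illustrated on a toy instance. The sketch below is a hypothetical example (all transition probabilities, rewards, and the two-project/two-state sizes are made up, and switching costs are omitted for brevity), not a reproduction of any cited paper's formulation: variables are discounted state-action occupation measures, per-project flow-balance constraints encode the dynamics, and the single "one active project per period" constraint is relaxed to hold only in expectation, which yields an upper bound on the optimal discounted reward.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy instance: 2 projects, 2 states each,
# actions a in {0 (passive), 1 (active)}, discount factor beta.
beta = 0.9
n_proj, n_states = 2, 2

# P[n, a, i, j]: transition probability of project n under action a.
P = np.array([
    [[[0.9, 0.1], [0.2, 0.8]],    # project 0, passive
     [[0.5, 0.5], [0.4, 0.6]]],   # project 0, active
    [[[0.7, 0.3], [0.3, 0.7]],    # project 1, passive
     [[0.6, 0.4], [0.1, 0.9]]],   # project 1, active
])
# r[n, a, i]: one-period reward (earned only when active here).
r = np.array([
    [[0.0, 0.0], [1.0, 2.0]],
    [[0.0, 0.0], [1.5, 0.5]],
])
alpha = np.full((n_proj, n_states), 1.0 / n_states)  # initial distribution

# Flattened index of occupation-measure variable x[n, a, i].
def idx(n, a, i):
    return (n * 2 + a) * n_states + i

nvar = n_proj * 2 * n_states
c = np.zeros(nvar)
for n in range(n_proj):
    for a in range(2):
        for i in range(n_states):
            c[idx(n, a, i)] = -r[n, a, i]  # linprog minimizes, so negate

A_eq, b_eq = [], []
# Per-project flow balance:
#   sum_a x[n,a,j] - beta * sum_{i,a} P[n,a,i,j] x[n,a,i] = alpha[n,j]
for n in range(n_proj):
    for j in range(n_states):
        row = np.zeros(nvar)
        for a in range(2):
            row[idx(n, a, j)] += 1.0
            for i in range(n_states):
                row[idx(n, a, i)] -= beta * P[n, a, i, j]
        A_eq.append(row)
        b_eq.append(alpha[n, j])

# Relaxed coupling constraint: the total discounted activation measure
# equals 1/(1-beta), i.e. one activation per period only in expectation.
row = np.zeros(nvar)
for n in range(n_proj):
    for i in range(n_states):
        row[idx(n, 1, i)] = 1.0
A_eq.append(row)
b_eq.append(1.0 / (1.0 - beta))

res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
              bounds=[(0, None)] * nvar, method="highs")
print("LP relaxation upper bound on discounted reward:", -res.fun)
```

Because the coupling constraint is enforced only in expectation over the discounted occupation measures, the LP optimum dominates the value of every feasible policy, which is what makes it usable as a performance bound for the heuristics the abstract describes.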
In this paper we study a Multi-Armed Restless Bandit Problem (MARBP) subject t...
This paper develops a framework based on convex optimization and economic ideas to formulate and sol...
Abstract—We consider two variants of the standard multi-armed bandit problem, namely, the multi-arme...
Abstract—We consider a task assignment problem for a fleet of UAVs in a surveillance/search mission....
We propose a mathematical programming approach for the classical PSPACE-hard problem of n restless...
We consider the multi-armed restless bandit problem (RMABP) with an infinite horizon average cost ob...
We provide a framework to analyse control policies for the restless Markovian bandit model, under bo...
We consider multi-action restless bandits with multiple resource constraints, also referred...
We study a resource allocation problem with varying requests and with resources of limited capacity ...
We explore the scheduling rules and the hedging levels that can be obtained by using a Restless Band...
We propose an asymptotically optimal heuristic, which we termed the Randomized Assignment Control (R...
Restless multi-armed bandits (RMABs) are an important model to optimize allocation of limited resour...
We develop a unifying framework to obtain efficient index policies for restles...
This paper studies optimal control subject to changing conditions. This is an ...