Online learning with expert advice and finite-horizon constraints

Branislav Kveton
Jia Yuan Yu
Georgios Theocharous
Shie Mannor

Publication date

January 2008

Abstract

In this paper, we study a sequential decision making problem. The objective is to maximize the average reward accumulated over time subject to temporal cost constraints. The novelty of our setup is that the rewards and constraints are controlled by an adverse opponent. To solve our problem in a practical way, we propose an expert algorithm that guarantees both a vanishing regret and a sublinear number of violated constraints. The quality of this solution is demonstrated on a real-world power management problem. Our results support the hypothesis that online learning with convex cost constraints can be performed successfully in practice.
