We study sequential decision-making with known rewards and unknown constraints, motivated by situations where the constraints represent expensive-to-evaluate human preferences, such as safe and comfortable driving behavior. We formalize the challenge of interactively learning about these constraints as a novel linear bandit problem which we call constrained linear best-arm identification. To solve this problem, we propose the Adaptive Constraint Learning (ACOL) algorithm. We provide an instance-dependent lower bound for constrained linear best-arm identification and show that ACOL's sample complexity matches the lower bound in the worst case. In the average case, ACOL's sample complexity bound is still significantly tighter than bounds of s...
We study the attainable regret for online linear optimization problems with bandit feedback, where u...
We provide the first algorithm for online bandit linear optimization whose regret after T rounds is ...
We consider a special case of bandit problems, named batched bandits, in which an agent observes bat...
In many fields such as digital marketing, healthcare, finance, and robotics, it is common to have a ...
Our goal is to efficiently learn reward functions encoding a human's preferences for how a dynamical...
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control li...
Model selection in the context of bandit optimization is a challenging problem, as it requires balan...
In interactive multi-objective reinforcement learning (MORL), an agent has to simultaneously learn a...
A fundamental challenge in interactive learning and decision making, ranging from bandit problems to...
We study human learning & decision-making in tasks with probabilistic rewards. Recent studies in...
Conveying complex objectives to reinforcement learning (RL) agents often requires meticulous reward ...
This paper explores a new form of the linear bandit problem in which the algorithm receives the usua...
AISTATS 2021, oral. 40 pages. Logistic Bandits have recently attracted substanti...
Linear bandits have a wide variety of applications including recommendation systems yet they make on...