This paper addresses the exploration-exploitation dilemma inherent in sequential decision-making, focusing on multi-armed bandit problems. These problems involve an agent deciding whether to exploit current knowledge for immediate gains or explore new avenues for potential long-term rewards. Here we introduce a novel algorithm, approximate information maximization (AIM), which employs an analytical approximation of the entropy gradient to choose which arm to pull at each point in time. AIM matches the performance of Infomax and Thompson sampling while also offering enhanced computational speed, determinism, and tractability. Empirical evaluation of AIM indicates compliance with the Lai-Robbins asymptotic bound and demonstrates its robustness for a r...
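To make the selection criterion concrete, the sketch below implements a brute-force Infomax-style rule for a two-armed Bernoulli bandit: pull the arm whose outcome is expected to most reduce the entropy of the posterior belief about which arm is best. This is the quantity that an analytical approximation such as AIM's is meant to avoid computing exactly; the Beta-posterior model, Monte Carlo estimates, and all function names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) variable."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def p_arm0_best(alpha, beta, n_samples=5_000):
    """Monte Carlo estimate of P(arm 0 has the higher mean) under Beta posteriors."""
    draws = rng.beta(alpha, beta, size=(n_samples, 2))
    return np.mean(draws[:, 0] > draws[:, 1])

def choose_arm(alpha, beta):
    """Infomax-style rule (illustrative, not AIM itself): pull the arm whose
    observation is expected to most reduce the entropy of the belief about
    which arm is best."""
    h_now = binary_entropy(p_arm0_best(alpha, beta))
    gains = []
    for arm in range(2):
        p_success = alpha[arm] / (alpha[arm] + beta[arm])  # posterior predictive mean
        h_after = 0.0
        for outcome, prob in ((1.0, p_success), (0.0, 1.0 - p_success)):
            a, b = alpha.copy(), beta.copy()
            a[arm] += outcome
            b[arm] += 1.0 - outcome
            h_after += prob * binary_entropy(p_arm0_best(a, b))
        gains.append(h_now - h_after)  # expected entropy reduction from pulling `arm`
    return int(np.argmax(gains))

# Toy run: two Bernoulli arms with unknown means 0.6 and 0.5, uniform Beta(1, 1) priors.
true_means = np.array([0.6, 0.5])
alpha, beta = np.ones(2), np.ones(2)
for _ in range(200):
    arm = choose_arm(alpha, beta)
    reward = float(rng.random() < true_means[arm])
    alpha[arm] += reward
    beta[arm] += 1.0 - reward
```

The nested Monte Carlo estimate of the probability that one arm is best is exactly what makes this exact rule expensive, which is the motivation for replacing it with an analytical approximation of the entropy gradient.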
The exploration/exploitation (E/E) dilemma arises naturally in many subfields of Science...
We propose information-directed sampling – a new algorithm for online optimization problems in whic...
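For comparison, the information-ratio idea behind information-directed sampling can be sketched in the same two-armed Bernoulli setting: score each arm by its squared expected single-step regret divided by the information its reward yields about the identity of the best arm, and pull the minimizer. The deterministic variant below, the Beta-posterior model, and the sample-based mutual-information estimate are simplifying assumptions; the full algorithm optimizes this ratio over randomized action distributions.

```python
import numpy as np

rng = np.random.default_rng(1)

def entropy(p):
    """Elementwise Bernoulli entropy in nats."""
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def information_ratio_choice(alpha, beta, n_samples=10_000):
    """Deterministic IDS-style rule: minimize (expected regret)^2 divided by the
    mutual information between the pulled arm's reward and the best-arm identity."""
    theta = rng.beta(alpha, beta, size=(n_samples, 2))   # joint posterior samples
    best = np.argmax(theta, axis=1)                      # sampled optimal arm
    p_best = np.array([np.mean(best == b) for b in range(2)])
    mean_theta = theta.mean(axis=0)
    # Expected single-step regret of each arm under the posterior.
    regret = np.mean(theta.max(axis=1)) - mean_theta
    gain = np.empty(2)
    for a in range(2):
        p1 = mean_theta[a]                               # P(reward = 1 | pull a)
        p1_given_best = np.array(
            [theta[best == b, a].mean() if np.any(best == b) else p1 for b in range(2)]
        )
        # I(best arm; reward of a) = H(reward) - sum_b P(best = b) H(reward | best = b)
        gain[a] = entropy(p1) - np.sum(p_best * entropy(p1_given_best))
    ratio = regret ** 2 / np.maximum(gain, 1e-12)
    return int(np.argmin(ratio))

# Example: choose an arm under Beta(2, 1) and Beta(1, 2) posteriors.
arm = information_ratio_choice(np.array([2.0, 1.0]), np.array([1.0, 2.0]))
```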
We consider a bandit problem where at any time, the decision maker can add new arms to her considera...
Entropy maximization and free energy minimization are general physical principles for modeling the d...
Multi-armed bandit, a popular framework for sequential decision-making problems, has recently gained...
We consider a class of multi-armed bandit problems where the reward obtained by pulling an a...
In this paper, we propose an information-theoretic exploration strategy for stochastic, discrete mul...
Sequential decision making problems often require an agent to act in an environment where data is no...
We address the problem of online sequential decision making, i.e., balancing the trade-off between e...
While in general trading off exploration and exploitation in reinforcement learning is hard, under s...
This thesis considers the multi-armed bandit (MAB) problem, both the traditional bandit feedback and...
We propose a Bayesian information-geometric approach to the exploration-exploi...
Stochastic multi-armed bandits solve the Exploration-Exploitation dilemma and ultimately maximize ...
We consider the problem of finding the best arm in a stochastic multi-armed ba...