The multi-armed bandit problem is an important optimization game that requires an exploration-exploitation tradeoff to achieve optimal total reward. Motivated by industrial applications such as online advertising and clinical research, we consider a setting where the rewards of bandit machines are associated with covariates, and the accurate estimation of the corresponding mean reward functions plays an important role in the performance of allocation rules. Under a flexible problem setup, we establish asymptotic strong consistency and perform a finite-time regret analysis for a sequential randomized allocation strategy based on kernel estimation. In addition, since many nonparametric and parametric methods in supervised learning may be applie...
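As a rough illustration of what a sequential randomized allocation strategy based on kernel estimation could look like, the sketch below estimates each arm's mean reward function with a Nadaraya-Watson estimator and randomizes between the apparently best arm and a uniformly chosen arm. It is not the procedure analyzed in the abstract: the Gaussian kernel, the fixed bandwidth, the `t ** (-1/3)` exploration schedule, the uniform covariates on [0, 1], and all function names are assumptions made for this example.

```python
import math
import random


def nw_estimate(x, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of the mean reward at covariate x.

    Uses a Gaussian kernel (an assumption of this sketch).
    Returns 0.0 when the arm has no observations yet.
    """
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * y for w, y in zip(weights, ys)) / total


def choose_arm(x, history, bandwidth, explore_prob, rng):
    """Randomized allocation: explore uniformly with probability
    explore_prob, otherwise play the arm with the largest kernel estimate."""
    if rng.random() < explore_prob:
        return rng.randrange(len(history))
    estimates = [nw_estimate(x, xs, ys, bandwidth) for xs, ys in history]
    return max(range(len(history)), key=lambda a: estimates[a])


def run(mean_fns, horizon, bandwidth=0.1, noise_sd=0.1, seed=0):
    """Simulate the strategy and return the cumulative regret.

    mean_fns: one true mean reward function per arm (known only to the
    simulator, used here to generate rewards and to measure regret).
    """
    rng = random.Random(seed)
    history = [([], []) for _ in mean_fns]  # (covariates, rewards) per arm
    regret = 0.0
    for t in range(1, horizon + 1):
        x = rng.random()  # covariate drawn uniformly on [0, 1]
        explore_prob = min(1.0, t ** (-1.0 / 3.0))  # hypothetical schedule
        arm = choose_arm(x, history, bandwidth, explore_prob, rng)
        reward = mean_fns[arm](x) + rng.gauss(0.0, noise_sd)
        history[arm][0].append(x)
        history[arm][1].append(reward)
        # Regret: gap between the best arm's mean and the chosen arm's mean.
        regret += max(f(x) for f in mean_fns) - mean_fns[arm](x)
    return regret


if __name__ == "__main__":
    arms = [lambda x: x, lambda x: 1.0 - x]  # toy mean reward functions
    print(run(arms, horizon=500))
```

The decaying exploration probability keeps every arm sampled often enough for its kernel estimate to improve while spending most later rounds on the arm that looks best at the current covariate; in practice the bandwidth and the schedule would need to be tuned or chosen according to theory.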
We consider the problem of finding the best arm in a stochastic multi-armed ba...
Algorithms based on upper-confidence bounds for balancing exploration and expl...
In this thesis, we study strategies for sequential resource allocation, under the so-called stochast...
University of Minnesota Ph.D. dissertation. July 2014. Major: Statistics. Advisor: Yuhong Yang. 1 co...
We study a multi-armed bandit problem in a setting where covariates are available. We take a nonpara...
This paper studies an important sequential decision making problem known as the multi-armed stochast...
We consider a bandit problem which involves sequential sampling from two populations (arms). Each ar...
The Multi-armed Bandit (MAB) problem is a classic example of the exploration-exploitation dilemma. I...
Consider a Bayesian sequential allocation problem that incorporates a covariate. The goal is to maxi...
We consider a multi-armed bandit problem in a setting where each arm produces a noisy reward realiza...
We survey the literature on multi-armed bandit models and their applications in economics. The multi...
This manuscript deals with the estimation of the optimal rule and its mean reward in a simple ban...
University of Minnesota Ph.D. dissertation. May 2020. Major: Statistics. Advisor: Yuhong Yang. 1 com...
Algorithms based on upper confidence bounds for balancing exploration and exploitation are gaining p...
Many sequential decision making problems require an agent to balance exploration and exploitation to...