We consider an experimental setting in which a matching of resources to participants has to be chosen repeatedly and returns from the individual chosen matches are unknown, but can be learned. Our setting covers two-sided and one-sided matching with (potentially complex) capacity constraints, such as refugee resettlement, social housing allocation, and foster care. We propose a variant of the Thompson sampling algorithm to solve such adaptive combinatorial allocation problems. We give a tight, prior-independent, finite-sample bound on the expected regret for this algorithm. Although the number of allocations grows exponentially in the number of matches, our bound does not. In simulations based on refugee resettlement data using a Bayesian h...
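As a rough illustration of the kind of procedure the abstract describes, the following minimal sketch runs Thompson sampling over one-to-one matchings. It is not the paper's algorithm: it assumes binary (Bernoulli) returns, independent Beta(1, 1) priors for each participant-resource pair, a square problem with no additional capacity constraints, and brute-force enumeration of matchings, which is only feasible for very small instances. The function name `thompson_matching` and all of its parameters are hypothetical.

```python
import itertools
import random


def thompson_matching(true_means, rounds, seed=0):
    """Thompson sampling over one-to-one matchings (illustrative sketch).

    true_means[i][j] is the (unknown to the learner) success probability
    of matching participant i to resource j. Assumes a square matrix.
    """
    rng = random.Random(seed)
    n = len(true_means)
    # Beta(1, 1) prior on every pair, stored as pseudo-counts.
    successes = [[1] * n for _ in range(n)]
    failures = [[1] * n for _ in range(n)]
    total_reward = 0

    for _ in range(rounds):
        # 1. Draw one plausible return for each pair from its posterior.
        theta = [
            [rng.betavariate(successes[i][j], failures[i][j]) for j in range(n)]
            for i in range(n)
        ]
        # 2. Choose the matching that is optimal for the sampled returns.
        #    Brute force over permutations: fine for tiny n only.
        best = max(
            itertools.permutations(range(n)),
            key=lambda p: sum(theta[i][p[i]] for i in range(n)),
        )
        # 3. Observe the outcome of each chosen match and update its posterior.
        for i, j in enumerate(best):
            outcome = 1 if rng.random() < true_means[i][j] else 0
            successes[i][j] += outcome
            failures[i][j] += 1 - outcome
            total_reward += outcome

    return total_reward, successes, failures


if __name__ == "__main__":
    means = [[0.9, 0.1], [0.1, 0.9]]
    reward, s, f = thompson_matching(means, rounds=200, seed=1)
    print("total reward:", reward)
    print("posterior success counts:", s)
```

The learner keeps one posterior per individual match rather than one per allocation, so the number of learned parameters grows with the number of matches even though the number of feasible allocations grows much faster. For larger instances the brute-force step would need to be replaced by an assignment solver.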
We study an idealised sequential resource allocation problem. In each time step the learner...
We introduce an adaptive targeted treatment assignment methodology for field experiments. Our Temper...
Presented at the Thirty-eighth International Conference on Machine Learning (ICML 2021) International...
We investigate stochastic combinatorial multi-armed bandit with semi-bandit fe...
The multi-armed bandit (MAB) problem provides a convenient abstraction for many online decision prob...
This paper considers the use of a simple posterior sampling algorithm to balance between exploration...
We address multi-armed bandits (MAB) where the objective is to maximize the cumulative reward under ...
In this paper we consider Thompson Sampling (TS) for combinatorial semi-bandit...
The multi-armed bandit problem is a popular model for studying exploration/exploitation trade-off in...
In this work, we address the combinatorial optimization problem in the stochastic bandit setting wit...
The Thompson Sampling exhibits excellent results in practice and it has been s...
Behavioral scientists are increasingly able to conduct randomized experiments in settings that enabl...
We address the problem of online sequential decision making, i.e., balancing the trade-off between e...
Bandit algorithms such as Thompson Sampling (TS) have been put forth for decades as useful for condu...
A challenging aspect of the bandit problem is that a stochastic reward is observed only for the chos...