We study bandit problems in which a decision-maker gets reward-or-failure feedback when choosing repeatedly between two alternatives, with fixed but unknown reward rates, over a short sequence of trials. We collect data across a number of types of bandit problems to analyze five heuristics (four seminal heuristics from machine learning, and one new model we develop) as models of human and optimal decision-making. We find that the new heuristic, known as τ-switch, which assumes a latent search state is followed by a latent stand state to control decision-making on key trials, is best able to mimic optimal decision-making, and best account for the decision-making of the majority of our experimental participants. We show how these results all...
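The search-then-stand idea behind τ-switch can be illustrated with a small sketch. The abstract states only that a latent search state is followed by a latent stand state on key trials; everything else below is an assumption made for illustration: the function name `tau_switch_choice`, the accuracy parameter `gamma`, the convention that trial `tau` itself still belongs to the search state, and the reading of "key trials" as those where one alternative has both more successes and more failures than the other.

```python
import random


def tau_switch_choice(trial, tau, successes, failures, gamma=0.9, rng=random):
    """Choose arm 0 or 1 in a two-armed bandit (illustrative sketch only).

    trial     -- 1-based index of the current trial
    tau       -- assumed last trial of the search state; stand state afterwards
    successes -- (s0, s1) observed rewards per arm
    failures  -- (f0, f1) observed failures per arm
    gamma     -- assumed probability of following the preferred arm
    """
    s0, s1 = successes
    f0, f1 = failures

    # Identical evidence for both arms: nothing to prefer, pick at random.
    if s0 == s1 and f0 == f1:
        return rng.randrange(2)

    if s0 >= s1 and f0 <= f1:
        preferred = 0  # arm 0 is at least as good on both counts
    elif s1 >= s0 and f1 <= f0:
        preferred = 1  # arm 1 is at least as good on both counts
    else:
        # Assumed "key trial": one arm has more successes AND more failures,
        # so it is better known; the other arm is the exploratory option.
        exploit = 0 if s0 > s1 else 1
        explore = 1 - exploit
        preferred = explore if trial <= tau else exploit

    # Follow the preferred arm with probability gamma, otherwise the other arm.
    return preferred if rng.random() < gamma else 1 - preferred
```

With `gamma=1.0` the rule is deterministic: given counts `(3, 1)` successes and `(2, 0)` failures, it picks the less-tried arm while `trial <= tau` and the better-known arm afterwards.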
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
Inspired by advertising markets, we consider large-scale sequential decision making problems in whic...
We study the problem of decision-making under uncertainty in the bandit setting. This thesis goes be...
We study bandit problems in which a decision-maker gets reward-or-failure feedback when choosing rep...
We consider a class of bandit problems in which a decision-maker must choose between a set of altern...
The bandit problem is a dynamic decision-making task that is simply described, well-suited to contro...
How people achieve long-term goals in an imperfectly known environment, via repeated tries and noisy...
We study human learning & decision-making in tasks with probabilistic rewards. Recent studies in...
How people achieve long-term goals in an imperfectly known environment, via repeated tries and noisy...
Abstract—We present a formal model of human decision-making in explore-exploit tasks using the conte...
How humans achieve long-term goals in an uncertain environment, via repeated trials and noisy observ...
How humans achieve long-term goals in an uncertain environment, via repeated trials and noisy observ...
Research in cognitive psychology regarding sequential decision-making usually involves tasks where a...
Bandit problems provide an interesting and widely-used setting for the study of sequential decision-...
Bandit problems provide an interesting and widely-used setting for the study of sequential decision-...