We study learning in a bandit task in which the outcome probabilities of six arms switch ("jump") over time. In the experiment, optimal Bayesian learning tracks the jumps either by inferring the probability of a jump or by detecting jumps directly. Although more complex than the natural alternative of learning through adaptive expectations, Bayesian learning, when combined with a partially myopic decision rule, better matches the behavior observed in the lab. This result suggests that agents may be less limited in their computational capacities than previously thought, and that complexity does not always hamper fully rational learning.
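The two learning rules contrasted in the abstract can be sketched concretely. Below is a minimal illustration, not the paper's actual model: a grid-based Bayesian update for a single Bernoulli arm whose success probability can jump (reset) with a fixed hazard rate, next to the simpler adaptive-expectations rule (exponential smoothing). The function names, the uniform reset distribution, and the parameters `hazard` and `alpha` are assumptions made for the sketch.

```python
import numpy as np

def bayes_update_with_jumps(grid, prior, outcome, hazard):
    """One Bayesian update for a Bernoulli arm whose success probability
    may jump with probability `hazard` per trial (a modeling assumption:
    a jump resets the parameter to a uniform draw over `grid`)."""
    # Likelihood of the observed outcome (0 or 1) at each candidate probability.
    like = grid if outcome == 1 else 1.0 - grid
    post = prior * like
    post /= post.sum()
    # Jump step: mix the posterior back toward the uniform prior,
    # weighted by the hazard of a jump before the next trial.
    uniform = np.ones_like(grid) / len(grid)
    return (1.0 - hazard) * post + hazard * uniform

def adaptive_expectation(estimate, outcome, alpha):
    """The simpler alternative: track the arm's payoff probability by
    exponentially smoothing outcomes with learning rate `alpha`."""
    return estimate + alpha * (outcome - estimate)
```

The Bayesian learner carries a full distribution over the arm's payoff probability and explicitly accounts for jumps; the adaptive-expectations learner carries only a point estimate whose learning rate implicitly discounts stale observations.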
Abstract—We present a formal model of human decision-making in explore-exploit tasks using the conte...
A multi-armed bandit problem models an agent that simultaneously attempts to acquire new information...
Imitation learning has been widely used to speed up learning in novice agents, by allowing them to l...
How people achieve long-term goals in an imperfectly known environment, via repeated tries and noisy...
Recently, evidence has emerged that humans approach learning using Bayesian updating rather than (mo...
How people achieve long-term goals in an imperfectly known environment, via repeated tries and noisy...
How humans achieve long-term goals in an uncertain environment, via repeated trials and noisy observ...
An important issue in financial decision-making is the way people process new information. Prior stu...
How humans achieve long-term goals in an uncertain environment, via repeated trials and noisy observ...
How do people learn? We assess, in a model-free manner, subjects' belief dynamics in a two-armed ban...
The two-armed bandit problem is a classical optimization problem where a player sequentially selects...
Neoclassical finance assumes that investors are Bayesian. In many realistic situations, Bayesian lea...
We explore a new model of bandit experiments where a potentially nonstationary sequence of contexts ...
My research attempts to address on-line action selection in reinforcement learning from a Bayesian p...
The bandit problem is a dynamic decision-making task that is simply described, well-suited to contro...