In real-life decision environments, people learn from their direct experience with alternative courses of action. Yet they can accelerate their learning by using functional knowledge about the features characterizing the alternatives. We designed a novel contextual multi-armed bandit task in which decision makers chose repeatedly between multiple alternatives characterized by two informative features. We compared human behavior in this contextual task with a classic multi-armed bandit task without feature information. Behavioral analysis showed that participants in the contextual bandit task used the feature information to direct their exploration of promising alternatives. Ex post, we tested participants' acquired functional knowledge in o...
How people achieve long-term goals in an imperfectly known environment, via repeated tries and noisy...
In contextual bandits, an algorithm must choose actions given observed contexts, learning from a r...
An n-armed bandit task was used to investigate the trade-off between exploratory (choosing lesser-kn...
The authors introduce the contextual multi-armed bandit task as a framework to investigate learning ...
The ability to transfer knowledge across tasks and generalize to novel ones is an important hallmark...
Poster presented at Reinforcement learning and decision making conference in 2015 in Edmonton, Ca...
How do humans search for rewards? This question is commonly studied using multi-armed bandit tasks, ...
Research in cognitive psychology regarding sequential decision-making usually involves tasks where a...
We study human learning & decision-making in tasks with probabilistic rewards. Recent studies in...
Presented as part of the ARC11 lecture on October 30, 2017 at 10:00 a.m. in the Klaus Advanced Compu...
This is raw data from the experiment described in the following article: Stojic, H., Analytis, P....
To what extent do human reward learning and decision-making rely on the ability to represent and gen...
A wealth of evidence in perceptual and economic decision-making research sugge...
An n-armed bandit task was used to investigate the trade-off between exploratory (...