How do people decide whether to try out novel options as opposed to tried-and-testedones? We argue that they infer a novel option’s reward from contextual informationlearned from functional relations and take uncertainty into account when making adecision. We propose a Bayesian optimization model to describe their learning and decisionmaking. This model relies on similarity-based learning of functional relationships betweenfeatures and rewards, and a choice rule that balances exploration and exploitation bycombining predicted rewards and the uncertainty of these predictions. Our model makestwo main predictions. First, decision makers who learn functional relationships willgeneralize based on the learned reward function, choosing novel optio...
Computational models of learning have proved largely successful in characterizing potential mechanis...
An important use of machine learning is to learn what people value. What posts or photos should a us...
The authors introduce the contextual multi-armed bandit task as a framework to investigate learning ...
How do people decide whether to try out novel options as opposed to tried-and-tested ones? We argue ...
Reinforcement learning algorithms have provided useful insights into human and an- imal learning and...
In repeated decision problems for which it is possible to learn from experience, people should activ...
Humans are often faced with an exploration-versus-exploitation trade-off. A commonly used paradigm, ...
In reinforcement learning (RL), a decision maker searching for the most rewarding option is often fa...
In this paper we computationally examine how subjective experience may help or harm the decision mak...
In this paper we computationally examine how subjec-tive experience may help or harm the decision ma...
none1noThis paper identifies the globally stable conditions under which an individual facing the sam...
How do humans search for rewards? This question is commonly studied using multi-armed bandit tasks, ...
Successful behaviour depends on the right balance between maximising reward and soliciting informati...
Many types of intelligent behavior can be framed as a search problem, where an individual must explo...
Aim: The nature of attention, and how it interacts with learning and choice processes in the context...
Computational models of learning have proved largely successful in characterizing potential mechanis...
An important use of machine learning is to learn what people value. What posts or photos should a us...
The authors introduce the contextual multi-armed bandit task as a framework to investigate learning ...
How do people decide whether to try out novel options as opposed to tried-and-tested ones? We argue ...
Reinforcement learning algorithms have provided useful insights into human and an- imal learning and...
In repeated decision problems for which it is possible to learn from experience, people should activ...
Humans are often faced with an exploration-versus-exploitation trade-off. A commonly used paradigm, ...
In reinforcement learning (RL), a decision maker searching for the most rewarding option is often fa...
In this paper we computationally examine how subjective experience may help or harm the decision mak...
In this paper we computationally examine how subjec-tive experience may help or harm the decision ma...
none1noThis paper identifies the globally stable conditions under which an individual facing the sam...
How do humans search for rewards? This question is commonly studied using multi-armed bandit tasks, ...
Successful behaviour depends on the right balance between maximising reward and soliciting informati...
Many types of intelligent behavior can be framed as a search problem, where an individual must explo...
Aim: The nature of attention, and how it interacts with learning and choice processes in the context...
Computational models of learning have proved largely successful in characterizing potential mechanis...
An important use of machine learning is to learn what people value. What posts or photos should a us...
The authors introduce the contextual multi-armed bandit task as a framework to investigate learning ...