a. The reward sensitivity beta scales how action weights (i.e., a combination of estimated probability and potential reward value) are translated into choices. Higher reward sensitivities produce more deterministic choices (i.e., exploitation), whereas lower reward sensitivities produce more random choices (i.e., exploration). b. The learning rate alpha captures how quickly estimated win probabilities are updated when new information becomes available. High learning rates (upper panel) lead to fast updates and rapid forgetting of long-term outcomes. The black line depicts the latent win probability, and the points depict the win probability estimated by the reinforcement learning model. c. Weighting of the estimated win probability of...
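The roles of beta and alpha described in panels a and b can be illustrated with a minimal sketch. It assumes a softmax choice rule and a delta-rule update, which are common in reinforcement learning models but are not confirmed by the caption as the authors' exact equations. The function names are hypothetical, and the action weights are taken as given because the caption's description of how they are computed is truncated.

```python
import math


def choice_probabilities(action_weights, beta):
    """Softmax over action weights, scaled by the reward sensitivity beta.

    Assumption: a softmax choice rule. beta = 0 gives uniform (fully random)
    choices; larger beta concentrates probability on the highest weight.
    """
    scaled = [beta * w for w in action_weights]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def update_win_probability(estimate, outcome, alpha):
    """Delta-rule update of an estimated win probability.

    Assumption: outcome is 1.0 for a win and 0.0 for a loss, and alpha lies
    in [0, 1]. Larger alpha moves the estimate further toward the latest
    outcome, so older outcomes are forgotten faster.
    """
    return estimate + alpha * (outcome - estimate)
```

Under these assumptions, raising beta for the same pair of action weights pushes the choice probabilities toward the better option, and raising alpha makes the estimate track the most recent outcomes more closely.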
Animals and humans often have to choose between options with reward distributions that are initially...
Each panel shows the difference between the values of the optimal and non-optimal options, as a f...
A. Choices were more random for noisier reward distributions (i.e., high values of βj) and for mea...
(A): During learning, 3 option pairs were presented in random order. Participants had to select the ...
a. Simulation of N = 50,000 players shows high rewards for different combinations of learning rate f...
Average accuracy and RT across subjects (N = 34) as a function of option pairs in the learning phase...
a. Learning rate for wins, αwin, does not change over runs (b = -0.006, p = .19). b. Learning rate f...
The ability to make optimal decisions depends on evaluating the expected rewards associated with dif...
(A) The switch from matching shoulders (MS) to rising optimum (RO) r...
(A) The contribution of serial hypothesis testing (SHT) was inversely correlated with reaction time ...
Abstract: Reinforcement learning (RL) models have been widely used to analyze the choice behavior of h...
Measurements of response time (RT) have long been used to infer neural processes underlying ...
When making repeated decisions, individuals can learn about associations between actions and outcome...
(A) Bayesian Information Criterion scores for each model (a low score is better). Models b...
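For reference, the Bayesian Information Criterion mentioned in this caption can be computed as in the sketch below. It uses the standard definition, BIC = k ln(n) - 2 ln(L); the function and argument names are hypothetical, and the caption does not state how the authors obtained their likelihoods.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian Information Criterion; lower is better.

    log_likelihood: maximized log-likelihood of the fitted model
    n_params: number of free parameters (k)
    n_obs: number of observations (n), e.g. trials
    """
    return n_params * math.log(n_obs) - 2.0 * log_likelihood
```

With the number of observations fixed, a model earns a lower score through a higher log-likelihood and pays a penalty of ln(n) for each additional free parameter.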