Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is influenced by an environmental signal, termed a reward, which directs the changes in appropriate directions. We apply a recently introduced policy learning algorithm from machine learning to networks of spiking neurons, and derive a spike-time-dependent plasticity rule which ensures convergence to a local optimum of the expected average reward. The approach is applicable to a broad class of neuronal models, including the Hodgkin-Huxley model. We demonstrate the effectiveness of the derived rule in several toy problems. Finally, through statistical analysis we show ...
Population coding is widely regarded as a key mechanism for achieving reliable behavioral decisions....
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only bee...
This paper suggests a learning-theoretic perspective on how synaptic plasticity benefits global brai...
Changes of synaptic connections between neurons are thought to be the physiological basis of learnin...
The persistent modification of synaptic efficacy as a function of the relative timing of pre- and p...
Biological neurons communicate primarily via a spiking process. Recurrently connected spiking neural...
How do animals learn to repeat behaviors that lead to the obtention of food or other “rewarding” obj...
Learning by reinforcement is important in shaping animal behavior. But behavioral decision making is...
Recent experiments have shown that spike-timing-dependent plasticity is influenced by neuromodulatio...
Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a ...
Although it is widely believed that reinforcement learning is a suitable tool for describing behavio...
Reward-modulated spike timing dependent plasticity (STDP) combines unsupervised STDP with a reinforc...