Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting “micro-Agents”, each of which has a separate discounting factor (γ). Each µAgent maintains an independent hypothesis about the stat...
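The claim that distributed exponential discounting yields hyperbolic discounting can be checked numerically. The sketch below is only an illustration under a stated assumption, not the model described in the abstract: it assumes the discounting factors γ are spread uniformly over (0, 1), in which case the average of γ^t over all agents approaches 1/(1 + t), since the integral of γ^t over [0, 1] equals 1/(1 + t). The function names and the number of agents are hypothetical.

```python
def mean_exponential_discount(delay, n_agents=10_000):
    """Average of gamma**delay over agents with evenly spaced gammas.

    Assumption (not from the source): gammas lie on a midpoint grid
    over (0, 1), approximating a uniform distribution.
    """
    gammas = [(i + 0.5) / n_agents for i in range(n_agents)]
    return sum(g ** delay for g in gammas) / n_agents


def hyperbolic_discount(delay):
    """Hyperbolic discount 1 / (1 + delay), with unit rate constant."""
    return 1.0 / (1.0 + delay)


# Compare the population average with the hyperbolic curve.
for delay in (0, 1, 5, 20):
    print(delay, mean_exponential_discount(delay), hyperbolic_discount(delay))
```

Each individual agent here discounts exponentially; only the population average follows the hyperbolic curve. A different distribution of γ would give a different aggregate discount function.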
Animals repeat rewarded behaviors, but the physiological basis of reward-based learning has only bee...
An open problem in the field of computational neuroscience is how to link synaptic plasticity to sys...
Many interesting problems in reinforcement learning (RL) are continuous and/or high dimensional, and...
Recent theoretical and experimental results suggest that the dopamine system implements distribution...
Reinforcement Learning (RL) algorithms allow artificial agents to improve their action selection pol...
We introduce a generalization of temporal-difference (TD) learning to networks of interrelated predi...
Temporal-difference (TD) learning can be used not just to predict rewards, as is commonly done in re...
Humans and animals are capable of evaluating actions by considering their long-run future rewards th...
Temporal difference (TD) methods constitute a class of methods for learning predictions in multi-ste...
This chapter presents a model of classical conditioning called the temporal-difference (TD) model. Th...
The temporal difference (TD) learning framework is a major paradigm for understanding value-based de...