Episodic control enables sample efficiency in reinforcement learning by recalling past experiences from an episodic memory. We propose a new model-based episodic memory of trajectories that addresses current limitations of episodic control. Our memory estimates trajectory values, guiding the agent towards good policies. Built upon the memory, we construct a complementary learning model via a dynamic hybrid control that unifies model-based, episodic, and habitual learning in a single architecture. Experiments demonstrate that our model allows significantly faster and better learning than other strong reinforcement learning agents across a variety of environments, including stochastic and non-Markovian settings.
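The abstract does not specify how the memory is written or read, so the following is only a generic sketch of episodic control, not the proposed model. It assumes a key-value store where each key is some fixed-length trajectory embedding, a max-return write rule, and an inverse-distance-weighted nearest-neighbour read. The names `EpisodicMemory`, `hybrid_value`, and the mixing weight `beta` are hypothetical and introduced purely for illustration.

```python
import math
from typing import List, Sequence, Tuple


class EpisodicMemory:
    """Toy key-value store (illustrative only, not the paper's memory).

    Each key is an assumed fixed-length trajectory embedding; each value is
    the best return observed so far for that exact key.
    """

    def __init__(self, k: int = 2) -> None:
        self.k = k  # number of neighbours used when reading
        self.keys: List[Tuple[float, ...]] = []
        self.values: List[float] = []

    def write(self, key: Sequence[float], ret: float) -> None:
        """Store a return, keeping the maximum if the key already exists."""
        key_t = tuple(float(x) for x in key)
        if key_t in self.keys:
            i = self.keys.index(key_t)
            self.values[i] = max(self.values[i], ret)
        else:
            self.keys.append(key_t)
            self.values.append(ret)

    def read(self, query: Sequence[float]) -> float:
        """Estimate a value as an inverse-distance-weighted mean of the k nearest keys."""
        if not self.keys:
            return 0.0  # assumed default for an empty memory
        query_t = tuple(float(x) for x in query)
        scored = sorted(
            (math.dist(query_t, key), value)
            for key, value in zip(self.keys, self.values)
        )
        nearest = scored[: self.k]
        weights = [1.0 / (dist + 1e-3) for dist, _ in nearest]
        total = sum(w * value for w, (_, value) in zip(weights, nearest))
        return total / sum(weights)


def hybrid_value(episodic: float, habitual: float, beta: float) -> float:
    """Blend an episodic estimate with a habitual (parametric) one.

    `beta` is a hypothetical mixing weight in [0, 1]; the abstract describes
    the hybrid control as dynamic, which this fixed blend does not capture.
    """
    return beta * episodic + (1.0 - beta) * habitual


if __name__ == "__main__":
    memory = EpisodicMemory(k=1)
    memory.write([0.0, 0.0], 1.0)
    memory.write([0.0, 0.0], 3.0)  # same key: the higher return is kept
    memory.write([1.0, 1.0], 2.0)
    print(memory.read([0.0, 0.0]))
    print(hybrid_value(memory.read([0.0, 0.0]), habitual=1.0, beta=0.5))
```

In this sketch the episodic estimate can react after a single high-return episode, while the habitual term stands in for a slowly trained value function; how the two are actually weighted against each other in the proposed architecture is not stated in the abstract.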
Abstract: Reinforcement learning systems usually assume that a value function is defined over all st...
Memory is a pillar of intelligence, and to think like us, it may be that artificial systems must rem...
Non-parametric episodic memory can be used to quickly latch onto high-reward experience in reinforce...
A longstanding goal in reinforcement learning is to build intelligent agents t...
Master's thesis for: Master in Cognitive Systems and Interactive Media. Directors: Ismael T. Freir...
Recently, neuro-inspired episodic control (EC) methods have been developed to overcome the data-inef...
Elements from cognitive psychology have been applied in a variety of ways to artificial intelligence...
In continual learning (CL), an agent learns from a stream of tasks leveraging prior experience to tr...
Augmenting the representation of the current state of the external world with ...
Modern deep reinforcement learning (RL) algorithms, despite being at the forefront of artificial int...
This paper presents a neural model that learns episodic traces in response to a continuous stream of...
Reinforcement learning in non-stationary environments is generally regarded as a very difficult prob...
For decades, neuroscientists and psychologists have observed that animal performance on spatial navi...