Temporal difference (TD) learning methods (Sutton & Barto 1998) have become popular reinforcement learning techniques in recent years. TD methods, relying on function approximators to generalize learning to novel situations, have had some experimental successes and have been shown to exhibit some desirable properties in theory, but have often been found slow in practice. This paper presents methods for further generalizing across tasks, thereby speeding up learning, via a novel form of behavior transfer. We compare learning on a complex task with three function approximators (a CMAC, a neural network, and an RBF) and demonstrate that behavior transfer works well with all three. Using behavior transfer, agents are able to learn one task ...
In this paper, we explore some issues associated with applying the Temporal Difference (TD) learning...
A key aspect of artificial intelligence is the ability to learn from experience. If examples of corr...
Reinforcement Learning has recently emerged as a viable solution for various sequential decision-mak...
Temporal difference (TD) learning methods (Sutton & Barto 1998) have become popular reinforcemen...
Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learni...
Agents, physical and virtual entities that interact with their environment, are becoming increasingly...
The goal of transfer learning algorithms is to utilize knowledge gained in a source task to speed up...
Transfer learning is an inherent aspect of human learning. When humans learn to perform a task, we r...
The goal of transfer learning algorithms is to utilize knowledge gained in a source task to speed up...
This article addresses a particular Transfer Reinforcement Learning (RL) problem: when dynamics do n...
Transfer learning refers to the process of conveying experience from a simple task to anoth...
Many interesting problems in reinforcement learning (RL) are continuous and/or high dimensional, and...
Many interesting problems in reinforcement learning (RL) are continuous and/or high dimensional, and...
To avoid the curse of dimensionality, function approximators are used in reinforcement learning to ...
Combining reinforcement learning algorithms with function approximators in order to generalize over ...