Many algorithms for approximate reinforcement learning are not known to converge. In fact, there are counterexamples showing that the adjustable weights in some algorithms may oscillate within a region rather than converging to a point. This paper shows that, for two popular algorithms, such oscillation is the worst that can happen: the weights cannot diverge, but instead must converge to a bounded region. The algorithms are SARSA(0) and V(0); the latter algorithm was used in the well-known TD-Gammon program.
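For context on the result above, the following is a minimal, illustrative sketch (not taken from the paper) of SARSA(0) with linear function approximation, the setting whose "adjustable weights" the abstract refers to. The `env` interface (`reset()`/`step()`), the `featurize` function, and all hyperparameters are assumptions made for this example.

```python
# Sketch of SARSA(0) with a linear Q-function Q(s, a) = w[a] . phi(s).
# The weight matrix w is the quantity argued to stay in a bounded region.
import numpy as np

def sarsa0_linear(env, featurize, n_actions, n_episodes=500,
                  alpha=0.05, gamma=0.99, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    n_features = featurize(env.reset()).shape[0]
    w = np.zeros((n_actions, n_features))      # adjustable weights

    def policy(phi):
        # epsilon-greedy with respect to the current linear Q estimate
        if rng.random() < epsilon:
            return int(rng.integers(n_actions))
        return int(np.argmax(w @ phi))

    for _ in range(n_episodes):
        phi = featurize(env.reset())
        a = policy(phi)
        done = False
        while not done:
            s_next, r, done = env.step(a)       # assumed env interface
            phi_next = featurize(s_next)
            a_next = policy(phi_next)
            # one-step SARSA temporal-difference error
            target = r if done else r + gamma * (w[a_next] @ phi_next)
            td_error = target - w[a] @ phi
            w[a] += alpha * td_error * phi      # gradient-style weight update
            phi, a = phi_next, a_next
    return w
```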
The Zap stochastic approximation (SA) algorithm was introduced recently as a means to accelerate con...
Many interesting problems in reinforcement learning (RL) are continuous and/or high dimensional, and...
Reinforcement learning is defined as the problem of an agent that learns to perform a certain task t...
Along with the sharp increase in visibility of the field, the rate at which ne...
This work presents the restricted gradient-descent (RGD) algorithm, a training method for lo...
A number of reinforcement learning algorithms have been developed that are guaranteed to converge to...
A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide ...
In order to scale to problems with large or continuous state-spaces, reinforcement learnin...
A key open problem in reinforcement learning is to assure convergence when using a compact hypothes...
Reinforcement learning is often done using parameterized function approximators to store value funct...
In order to solve realistic reinforcement learning problems, it is critical that approximate algor...
The application of reinforcement learning to problems with continuous domains requires representing ...
Although tabular reinforcement learning (RL) methods have been proved to converge to an op...
Reinforcement learning deals with the problem of sequential decision making in uncertain stochastic ...