This paper focuses on improving the efficiency of online actor-critic design based on the Levenberg-Marquardt (LM) algorithm rather than the traditional chain rule. Over the decades, several generations of adaptive/approximate dynamic programming (ADP) structures have been proposed in the community and have demonstrated many successful applications. Neural networks trained with backpropagation have been one of the most important approaches for tuning the parameters in such ADP designs. In this paper, we study the integration of the Levenberg-Marquardt method into the regular actor-critic design to improve weight updating and learning, with quadratic convergence under certain conditions. Specifically, for the critic network design, we adopt the LM method ta...
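To illustrate the kind of update the abstract refers to, the following is a minimal sketch of a damped Levenberg-Marquardt step applied to a toy critic. It is not the paper's algorithm, whose details are cut off above. The critic form `w[1] * tanh(w[0] * x)`, the synthetic targets, the damping schedule (divide or multiply `mu` by 10), and all function names are assumptions made only for this example.

```python
import numpy as np

def lm_step(w, residual_fn, jacobian_fn, mu):
    """One LM update: w - (J^T J + mu I)^{-1} J^T e."""
    e = residual_fn(w)
    J = jacobian_fn(w)
    H = J.T @ J + mu * np.eye(w.size)  # damped Gauss-Newton approximation
    return w - np.linalg.solve(H, J.T @ e)

def lm_fit(w, residual_fn, jacobian_fn, mu=1e-2, iters=50):
    """Accept a step only if it lowers the cost; adapt the damping mu."""
    cost = 0.5 * np.sum(residual_fn(w) ** 2)
    for _ in range(iters):
        w_try = lm_step(w, residual_fn, jacobian_fn, mu)
        cost_try = 0.5 * np.sum(residual_fn(w_try) ** 2)
        if cost_try < cost:
            w, cost, mu = w_try, cost_try, mu / 10.0  # trust the model more
        else:
            mu *= 10.0  # fall back toward small gradient-descent-like steps
    return w, cost

# Hypothetical toy critic: J_hat(x) = w[1] * tanh(w[0] * x).
x = np.linspace(-2.0, 2.0, 21)
targets = 2.0 * np.tanh(0.5 * x)  # synthetic stand-in for cost-to-go targets

def residual(w):
    return w[1] * np.tanh(w[0] * x) - targets

def jacobian(w):
    t = np.tanh(w[0] * x)
    # Columns: d(residual)/d(w[0]) and d(residual)/d(w[1]).
    return np.column_stack([w[1] * x * (1.0 - t ** 2), t])

w0 = np.array([1.0, 1.0])
initial_cost = 0.5 * np.sum(residual(w0) ** 2)
w_fit, final_cost = lm_fit(w0, residual, jacobian)
print(initial_cost, final_cost, w_fit)
```

With a large `mu` the step behaves like a short gradient-descent step, and with a small `mu` it approaches a Gauss-Newton step, which is where the fast local convergence of LM-type methods comes from.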
This paper discusses strategies for and details of training procedures for the dual heuristic progra...
This paper deals with reinforcement learning for process modeling and control using a model-free, ...
In this paper we propose to integrate the recursive Levenberg-Marquardt method into the adaptive dyn...
Adaptive Dynamic Programming (ADP) with critic-actor architecture is an effective way to perform onl...
Cover: Saturated policy for the pendulum swing-up problem as learned by the model learning actor-cri...
In this paper, we propose a novel strategy for approximating policy evaluation during online critic-...
Policy gradient based actor-critic algorithms are amongst the most popular algorithms in th...
Classical control theory requires a model to be derived for a system, before any control design can ...
In this paper, we analyze a class of actor-critic algorithms under partially observable Mar...
In this paper, a novel reinforcement learning neural network (NN)-based controller, referred to as adap...
In this paper, we propose a novel adaptive dynamic programming (ADP) architecture with three network...
This paper deals with reinforcement learning for process modeling and control using a mode...
The reinforcement learning (RL) framework enables the construction of controllers that try to find an ...