This paper proposes a new regularization technique for reinforcement learning (RL) towards making policy and value functions smooth and stable. RL is known for the instability of the learning process and the sensitivity of the acquired policy to noise. Several methods have been proposed to resolve these problems, and in summary, the smoothness of policy and value functions learned mainly in RL contributes to these problems. However, if these functions are extremely smooth, their expressiveness would be lost, resulting in not obtaining the global optimal solution. This paper therefore considers RL under local Lipschitz continuity constraint, so-called L2C2. By designing the spatio-temporal locally compact space for L2C2 from the state transi...
Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. With...
The successful application of Reinforcement Learning (RL) techniques to robot control is limited by ...
Abstract-Reinforcement learning with linear and non-linear function approximation has been studied e...
Recent advances in machine learning, simulation, algorithm design, and computer hardware have allowe...
Dynamic control tasks are good candidates for the application of reinforcement learning techniques. ...
This paper proposes a relaxed control regularization with general exploration rewards to design robu...
This is the version of record. It originally appeared on arXiv at http://arxiv.org/abs/1603.00748.Mo...
Abstract — Reinforcement learning with linear and non-linear function approximation has been studied...
Recently, Petrik et al. demonstrated that L1-Regularized Approximate Linear Programming (RALP) could...
Feature representation is critical not only for pattern recognition tasks but also for reinforcement...
© The Author(s) 2020. We propose a novel framework for learning stabilizable nonlinear dynamical sys...
Despite the recent success of reinforcement learning in various domains, these approaches remain, fo...
Approximate Reinforcement Learning (RL) is a method to solve sequential decisionmaking and dynamic c...
Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before depl...
Trajectory-Centric Reinforcement Learning and Trajectory Optimization methods optimize a sequence of...
Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. With...
The successful application of Reinforcement Learning (RL) techniques to robot control is limited by ...
Abstract-Reinforcement learning with linear and non-linear function approximation has been studied e...
Recent advances in machine learning, simulation, algorithm design, and computer hardware have allowe...
Dynamic control tasks are good candidates for the application of reinforcement learning techniques. ...
This paper proposes a relaxed control regularization with general exploration rewards to design robu...
This is the version of record. It originally appeared on arXiv at http://arxiv.org/abs/1603.00748.Mo...
Abstract — Reinforcement learning with linear and non-linear function approximation has been studied...
Recently, Petrik et al. demonstrated that L1-Regularized Approximate Linear Programming (RALP) could...
Feature representation is critical not only for pattern recognition tasks but also for reinforcement...
© The Author(s) 2020. We propose a novel framework for learning stabilizable nonlinear dynamical sys...
Despite the recent success of reinforcement learning in various domains, these approaches remain, fo...
Approximate Reinforcement Learning (RL) is a method to solve sequential decisionmaking and dynamic c...
Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before depl...
Trajectory-Centric Reinforcement Learning and Trajectory Optimization methods optimize a sequence of...
Reinforcement learning (RL) is promising for complicated stochastic nonlinear control problems. With...
The successful application of Reinforcement Learning (RL) techniques to robot control is limited by ...
Abstract-Reinforcement learning with linear and non-linear function approximation has been studied e...