Though deep reinforcement learning (DRL) has achieved substantial success, it may encounter catastrophic failures due to the intrinsic uncertainty of both transition and observation. Most existing methods for safe reinforcement learning can handle either transition disturbance or observation disturbance, but not both, since these two kinds of disturbance affect different parts of the agent; moreover, the popular worst-case return may lead to overly pessimistic policies. To address these issues, we first theoretically prove that the performance degradation under transition disturbance and observation disturbance depends on a novel metric, the Value Function Range (VFR), which corresponds to the gap in the value function between the best state and the wor...
In safe Reinforcement Learning (RL), the agent attempts to find policies which maximize the expectat...
Training Reinforcement Learning (RL) agents in high-stakes applications might be too prohibitive due...
In reinforcement learning (RL), an agent must explore an initially unknown environment in order to l...
This letter aims to solve a safe reinforcement learning (RL) problem with risk measure-based constra...
Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected e...
Decision theory addresses the task of choosing an action; it provides robust decision-making criteri...
The objective in a traditional reinforcement learning (RL) problem is to find a policy that optimize...
Replicating the human ability to solve complex planning problems based on minimal prior knowledge ha...
As safety violations can lead to severe consequences in real-world robotic applications, the increas...
Deploying reinforcement learning (RL) involves major concerns around safety. Engineering a reward si...
Peer reviewed. With regard to future service robots, unsafe exceptional circumstances can occur in com...
Reinforcement learning is a family of machine learning algorithms, in which the system learns to mak...
Risk-sensitive reinforcement learning (RL) aims to optimize policies that balance the expected rewar...
Conditional Value at Risk (CVaR) is a prominent risk measure that is being used extensively in vario...
This paper concerns the efficient construction of a safety shield for reinforcement learning. We spe...