This paper presents the concept of adaptive safe padding, which forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Policies are synthesised to satisfy a goal, expressed as a temporal logic formula, with maximal probability. Forcing the RL agent to stay safe during learning might limit exploration; however, we show that the proposed architecture automatically handles the trade-off between efficient progress in exploration (towards goal satisfaction) and ensuring safety. Theoretical guarantees are available on the optimality of the synthesised policies and on the convergence of the learning algorithm. Experimental results are provided to showcase the per...
Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied...
Reinforcement learning is an increasingly popular framework that enables robots to learn to perform ...
In reinforcement learning (RL), an agent must explore an initially unknown environment in order to l...
Reinforcement learning algorithms discover policies that maximize reward, but do not necessarily gua...
In safe Reinforcement Learning (RL), the agent attempts to find policies which maximize the expectat...
Deploying reinforcement learning (RL) involves major concerns around safety. Engineering a reward si...
This paper concerns the efficient construction of a safety shield for reinforcement learning. We spe...
Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied...
In safe Reinforcement Learning (RL), the agent attempts to find policies which maximize the expectat...
Deep reinforcement learning (DRL) has shown remarkable success in artificial domains and in some rea...
Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from...
We consider the safe reinforcement learning (RL) problem of maximizing utility with extremely low co...
In this article, we present an intermittent framework for safe reinforcement learning (RL) algorithm...
In safety-critical applications, autonomous agents may need to learn in an environment where mistake...
Reinforcement learning (RL) is a general method for agents to learn optimal control policies through...