The general assumption in reinforcement learning (RL) that agents are free to explore in search of optimal policies limits its applicability in real-world domains where safe exploration is desired. In this paper, we study the problem of constrained RL in episodic MDPs to investigate efficient exploration in safe RL. We formally describe two different constraint schemes frequently considered in empirical studies: soft constrained RL, which focuses on the overall safety satisfaction, and hard constrained RL, which aims to provide guarantees throughout learning. While violations may occur in the former scheme, the latter enforces safety by extending the challenging knapsack problem in multi-armed bandits. Accordingly, we propose two nov...
Reinforcement learning (RL) is a machine learning paradigm where an agent learns to interact with an...
One of the challenges in online reinforcement learning (RL) is that the agent ...
Constrained reinforcement learning (CRL) has gained significant interest recently, since safety cons...
Many physical systems have underlying safety considerations that require that the policy employed en...
In reinforcement learning (RL), an agent must explore an initially unknown environment in order to l...
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that...
Reinforcement learning (RL) agents need to explore their environments in order to learn optimal poli...
We address the issue of safety in reinforcement learning. We pose the problem in an episodic framewo...
One approach to guaranteeing safety in Reinforcement Learning is through cost constraints that are d...
Reinforcement learning (RL) focuses on an essential aspect of intelligent behavior – how an agent ca...
In safe Reinforcement Learning (RL), the agent attempts to find policies which maximize the expectat...
Deploying reinforcement learning (RL) involves major concerns around safety. Engineering a reward si...
In many real-world applications of reinforcement learning (RL), performing actions requires consumin...