In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns. A risk measure often focuses on the worst returns out of the agent's experience. As a result, standard methods for risk-averse RL often ignore high-return strategies. We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it. We also devise a novel Cross Entropy module for risk sampling, which (1) preserves risk aversion despite the soft risk; (2) independently improves sample efficiency. By separating the risk aversion of the sampler and the optimizer, we can sample episodes with poor conditions, yet optimize with respect to successful strategies. We combine thes...
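To make the idea of a risk measure that "focuses on the worst returns" and of a "soft risk" level concrete, the following is a minimal sketch, not the authors' implementation. It assumes the risk measure is Conditional Value at Risk (CVaR), which the abstract does not name, and it uses a hypothetical linear schedule that starts risk-neutral and anneals toward the target risk level. The names `empirical_cvar`, `soft_risk_level`, and `warmup_frac` are illustrative assumptions.

```python
import numpy as np


def empirical_cvar(returns, alpha):
    """Mean of the worst alpha-fraction of returns (assumed risk measure: CVaR)."""
    sorted_returns = np.sort(np.asarray(returns, dtype=float))
    # Keep at least one episode so the mean is always defined.
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return float(sorted_returns[:k].mean())


def soft_risk_level(step, total_steps, target_alpha, warmup_frac=0.8):
    """Hypothetical soft-risk schedule (an assumption, not taken from the source).

    Starts at 1.0 (risk-neutral: all episodes count) and decreases linearly
    to target_alpha over the first warmup_frac of training.
    """
    progress = min(1.0, step / (warmup_frac * total_steps))
    return 1.0 - progress * (1.0 - target_alpha)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    episode_returns = rng.normal(loc=1.0, scale=2.0, size=1000)
    for step in (0, 40, 100):
        level = soft_risk_level(step, total_steps=100, target_alpha=0.05)
        print(step, round(level, 3), round(empirical_cvar(episode_returns, level), 3))
```

Early in training the level is close to 1, so high-return episodes still contribute to the objective; only later does the objective narrow to the worst tail. This is one plausible reading of how a soft risk level could avoid discarding successful strategies too early.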
Sequential decision-making abounds in real-world problems ranging from robots needing to interact ...
In safe Reinforcement Learning (RL), the agent attempts to find policies which maximize the expectat...
Safe exploration is regarded as a key priority area for reinforcement learning research. With separa...
Keeping risk under control is a primary objective in many critical real-world domains, including fin...
Replicating the human ability to solve complex planning problems based on minimal prior knowledge ha...
Training Reinforcement Learning (RL) agents in high-stakes applications might be prohibitive due...
The use of reinforcement learning in algorithmic trading is of growing interest, since it offers the...
Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected e...
(Peer reviewed) Classical reinforcement learning (RL) techniques are generally concerned with the desig...
We consider the problem of learning models for risk-sensitive reinforcement learning. We theoretical...
This letter aims to solve a safe reinforcement learning (RL) problem with risk measure-based constra...
The objective in a traditional reinforcement learning (RL) problem is to find a policy that optimize...
Accepted at the 5th Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLD...
We develop an approach for solving time-consistent risk-sensitive stochastic optimization problems u...
Reinforcement learning depends on agents being learning individuals, and when agents rely on their i...