In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are states that are undesirable or dangerous to enter. We define the risk with respect to a policy as the probability of entering such a state when the policy is pursued. We consider the problem of finding good policies whose risk is smaller than some user-specified threshold, and formalize it as a constrained MDP with two criteria. The first criterion corresponds to the value function originally given. We show that the risk can be formulated as a second criterion function based on a cumulative return, whose definition is independent of the original value function. We present a model-free, heuristic reinforcement learning algorithm that ai...
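To make the risk criterion concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a fixed policy whose induced transition matrix `P` is known (the paper's method is model-free), treats error and goal states as absorbing, and computes the risk as an undiscounted cumulative return with an auxiliary reward of 1 on entering an error state. The function name `policy_risk`, the four-state chain, and the threshold `omega` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def policy_risk(P, error_states, goal_states, n_iter=1000):
    """Probability of ever entering an error state, per start state.

    P is the transition matrix induced by a fixed policy (assumed known here).
    Error and goal states are treated as absorbing; their own risk is
    reported as 0 because the episode ends there.
    """
    n = P.shape[0]
    # Auxiliary reward: 1 when the successor is an error state, else 0.
    enters_error = np.zeros(n)
    enters_error[list(error_states)] = 1.0
    absorbing = np.zeros(n, dtype=bool)
    absorbing[list(error_states) + list(goal_states)] = True

    rho = np.zeros(n)
    for _ in range(n_iter):
        # No further return is collected after absorption.
        continuation = np.where(absorbing, 0.0, rho)
        # Undiscounted backup: immediate auxiliary reward plus future risk.
        rho = P @ (enters_error + continuation)
        rho[absorbing] = 0.0
    return rho

# Hypothetical chain: states 0 and 1 are transient, 2 is the error state,
# 3 is the goal state.
P = np.array([
    [0.0, 0.8, 0.2, 0.0],
    [0.0, 0.0, 0.1, 0.9],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
rho = policy_risk(P, error_states=[2], goal_states=[3])

omega = 0.3  # hypothetical user-specified risk threshold
admissible = rho[0] <= omega  # constraint check for start state 0
```

In this chain the risk from state 1 is 0.1 and the risk from state 0 is 0.2 + 0.8 × 0.1 = 0.28, so the policy would satisfy a threshold of 0.3 when started in state 0. Note that the auxiliary reward does not depend on the original reward function, which mirrors the independence claimed in the abstract.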
In this paper, we describe how techniques from reinforcement learning might be used to approach the ...
We introduce a class of Markov decision problems (MDPs) which greatly simplify Reinforcement Learnin...
Deploying reinforcement learning (RL) involves major concerns around safety. Engineering a reward si...
Markov decision processes (MDPs) are the de facto framework for sequential decision making in the pre...
www.cs.tu-berlin.de/~geibel Abstract. In this article, I will consider Markov Decision Processes wit...
Replicating the human ability to solve complex planning problems based on minimal prior knowledge ha...
This paper concerns the efficient construction of a safety shield for reinforcement learning. We spe...
We present a reinforcement learning approach to explore and optimize a safety-constrained Markov Dec...
This paper considers sequential decision making problems under uncertainty, the tradeoff between the...
Markov decision processes (MDPs) are a standard modeling tool for sequential decision making in a dyna...
Many physical systems have underlying safety considerations that require that the policy employed en...
This thesis dives into the theory of discrete time stochastic optimal control through exploring dyna...
This thesis is accomplished in the context of the industrial simulation domain that addresses the pr...
Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due ...
Markov decision processes (MDPs) and their variants are widely studied in the theory of controls for...