A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy that maximizes the expected discounted-sum payoff. However, such policies may still permit unlikely but highly undesirable outcomes, which is especially problematic in safety-critical applications. Recently, there has been a surge of interest in POMDPs where the goal is to maximize the probability that the payoff is at least a given threshold, but these approaches do not consider any optimization beyond satisfying this threshold constraint. In this work we go beyond both the “expectation” and “threshold” approaches and consider a “guaranteed payoff optimization (GPO)” problem for POMDPs, where we are given a threshold t and the obje...
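To make the contrast between the “expectation” and “threshold” objectives concrete, the following minimal sketch compares two toy policies. Everything in it is an illustrative assumption rather than part of the work described above: the outcome distributions `risky` and `safe`, the discount factor `GAMMA`, the threshold `T`, and the use of short finite reward sequences in place of infinite runs. It does not implement the GPO problem itself.

```python
from typing import List, Tuple

# Hypothetical representation: a policy induces a distribution over reward
# sequences, given here as (probability, rewards) pairs.
Outcomes = List[Tuple[float, List[float]]]


def discounted_sum(rewards: List[float], gamma: float) -> float:
    """Discounted-sum payoff of one (finite, for illustration) reward sequence."""
    return sum(r * gamma**i for i, r in enumerate(rewards))


def expected_payoff(outcomes: Outcomes, gamma: float) -> float:
    """'Expectation' objective: probability-weighted mean of the payoffs."""
    return sum(p * discounted_sum(rs, gamma) for p, rs in outcomes)


def threshold_probability(outcomes: Outcomes, gamma: float, t: float) -> float:
    """'Threshold' objective: probability that the payoff is at least t."""
    return sum(p for p, rs in outcomes if discounted_sum(rs, gamma) >= t)


GAMMA = 0.5  # assumed discount factor
T = 5.0      # assumed payoff threshold

# Assumed toy policies: 'risky' usually pays well but sometimes pays nothing,
# while 'safe' always yields a modest payoff.
risky: Outcomes = [(0.9, [10.0, 10.0]), (0.1, [0.0, 0.0])]
safe: Outcomes = [(1.0, [4.0, 4.0])]

if __name__ == "__main__":
    for name, policy in (("risky", risky), ("safe", safe)):
        print(
            name,
            "expected payoff:", expected_payoff(policy, GAMMA),
            "P(payoff >= T):", threshold_probability(policy, GAMMA, T),
        )
```

In this toy setting the risky policy has the higher expected payoff, while the safe policy meets the threshold with higher probability, so the two objectives rank the policies differently.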
Standard value function approaches to finding policies for Partially Observable Markov Decision Proc...
We consider partially observable Markov decision processes (POMDPs) with a set of target states and ...
We study the problem of approximation of optimal values in partially-observable Markov decision proc...
Partially-observable Markov decision processes (POMDPs) with discounted-sum payoff are a standard fr...
We consider Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) objectives....
This study extends the framework of partially observable Markov decision processes (POMDPs) ...
Optimal policy computation in finite-horizon Markov decision processes is a classical problem in opt...
We consider partially observable Markov decision processes (POMDPs) with limit-average payoff, where...
Partially observable Markov decision processes (POMDPs) provide a framework for the optimization of M...