Partially-Observable Markov Decision Processes (POMDPs) are a well-known stochastic model for sequential decision making under limited information. We consider the EXPTIME-hard problem of synthesising policies that almost surely reach some goal state without ever visiting a bad state. In particular, we are interested in computing the winning region, that is, the set of system configurations from which a policy exists that satisfies the reachability specification. A direct application of such a winning region is the safe exploration of POMDPs, for instance by restricting the behavior of a reinforcement learning agent to the region. We present two algorithms: a novel SAT-based iterative approach and a decision-diagram-based alternative. The ...
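The winning-region idea in the abstract above can be illustrated on the belief-support abstraction of a POMDP. The sketch below handles only the safety part of the objective ("never visit a bad state") on a toy model: a greatest-fixpoint computation that keeps exactly those belief supports from which some action keeps every observation-consistent successor inside the region. The toy POMDP, all names, and the explicit fixpoint loop are illustrative assumptions — this is not the paper's SAT-based or decision-diagram-based algorithm.

```python
# Minimal sketch (assumed, not the paper's implementation): a safety winning
# region over belief supports of a toy POMDP.
from itertools import combinations

STATES = ["s0", "s1", "bad"]                      # toy state space
ACTIONS = ["a", "b"]
BAD = {"bad"}
OBS = {"s0": "o0", "s1": "o0", "bad": "o_bad"}    # s0 and s1 are indistinguishable

# POST[(state, action)] = states reachable with positive probability
POST = {
    ("s0", "a"): {"s0", "s1"}, ("s0", "b"): {"bad"},
    ("s1", "a"): {"s1"},       ("s1", "b"): {"s0"},
    ("bad", "a"): {"bad"},     ("bad", "b"): {"bad"},
}

def successors(support, action):
    """Belief-support successors: image under the action, split by observation."""
    image = set().union(*(POST[(s, action)] for s in support))
    by_obs = {}
    for s in image:
        by_obs.setdefault(OBS[s], set()).add(s)
    return [frozenset(v) for v in by_obs.values()]

def safety_winning_region():
    """Greatest fixpoint: keep supports that contain no bad state and have
    some action whose every observation-successor stays inside the region."""
    supports = [frozenset(c)
                for r in range(1, len(STATES) + 1)
                for c in combinations(STATES, r)]
    win = {b for b in supports if not (b & BAD)}
    changed = True
    while changed:
        changed = False
        for b in list(win):
            if not any(all(t in win for t in successors(b, a)) for a in ACTIONS):
                win.discard(b)
                changed = True
    return win

region = safety_winning_region()
```

On this toy model, {s0}, {s1}, and {s0, s1} survive the fixpoint (action "a" keeps each of them inside the region), while any support containing "bad" is excluded from the start. A policy restricted to actions that the region certifies as safe realizes the "safe exploration" use described in the abstract.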
Partially observable Markov decision processes (POMDPs) provide a modeling framework for a variety of...
Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning pr...
Acting in domains where an agent must plan several steps ahead to achieve a goal can be a ch...
Partially observable Markov decision processes (POMDPs) are widely used in probabilistic planning pr...
The Partially Observable Markov Decision Process (POMDP) framework has proven useful in planning dom...
Partially observable Markov decision processes (POMDPs) provide a natural and principled framework t...
Much of reinforcement learning theory is built on top of oracles that are computationally hard to im...
Partially observable Markov decision processes (POMDPs) are interesting because they provide a gener...
We consider partially observable Markov decision processes (POMDPs) with a set of target states and ...
Partially observable Markov decision processes (POMDPs) provide a natural and principled framework t...
Partially Observable Markov Decision Processes (POMDPs) provide a rich representation for agents act...
We consider partially observable Markov decision processes (POMDPs) with a set of target states and ...
International peer-reviewed conference proceedings. International audience. A new algorithm for s...
Partially observable Markov decision processes (POMDPs) are a natural model for planning problems wh...
POMDPs are standard models for probabilistic planning problems, where an agent interacts with an unc...