We present new algorithms for computing and approximating bisimulation metrics in Markov Decision Processes (MDPs). Bisimulation metrics are an elegant formalism that captures behavioral equivalence between states and provides strong theoretical guarantees on differences in optimal behavior. Unfortunately, their computation is expensive and requires a tabular representation of the states, which has thus far rendered them impractical for large problems. In this paper we present a new version of the metric that is tied to a behavior policy in an MDP, along with an analysis of its theoretical properties. We then present two new algorithms for approximating bisimulation metrics in large, deterministic MDPs. The first does so via sampling and is ...
In this work we introduce new approximate similarity relations that are shown to be key for policy (...
We provide a novel, flexible, iterative refinement algorithm to automatically construct an approxima...
We define a metric for measuring behavior similarity between states in a Markov decision process (MD...
Bisimulation is a notion of behavioural equivalence on the states of a transition system. Its defi-...
Bisimulation is a notion of behavioural equivalence on the states of a transiti...
State abstraction and value function approximation are essential tools for the feasibility of sequen...
Probabilistic bisimulation is a widely studied equivalence relation for stochastic systems. However,...
Solution methods for MDPs employing approximation allow for more acceptable computation time in dom...
This dissertation addresses the problem of sequential decision making under uncertainty in large sys...
A popular approach to solving large probabilistic systems relies on aggregating states based on a m...