Our setting is a Partially Observable Markov Decision Process (POMDP) with continuous state, observation, and action spaces. Decisions are based on a particle filter that estimates the belief state from past observations. We consider a policy gradient approach to parameterized policy optimization. To that end, we investigate sensitivity analysis of the performance measure with respect to the policy parameters, focusing on Finite Difference (FD) techniques. We show that the naive FD estimator is subject to variance explosion because of the non-smoothness of the resampling procedure. We propose a more sophisticated FD method that overcomes this problem and establish its consistency.
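To make the variance-explosion issue concrete, here is a minimal sketch of the naive FD estimator applied to a bootstrap particle filter. Everything in it is an illustrative assumption rather than the paper's actual model: the linear policy, the Gaussian transition and observation models, and the reward are hypothetical placeholders for a generic continuous POMDP. Even with common random numbers shared between the two perturbed runs, the multinomial resampling step selects particle indices that can jump discontinuously between theta - h and theta + h, which is what drives the variance of the estimator to infinity as h shrinks.

```python
import numpy as np

def particle_filter_return(theta, observations, seed, n_particles=100):
    """Run a bootstrap particle filter under a parameterized policy and
    return the cumulative reward. Transition, likelihood, policy, and
    reward below are hypothetical placeholders."""
    rng = np.random.default_rng(seed)              # common random numbers
    particles = rng.normal(0.0, 1.0, n_particles)  # initial belief sample
    total_reward = 0.0
    for y in observations:
        # Policy acts on a belief summary (here: the particle mean).
        action = theta * particles.mean()          # illustrative linear policy
        # Propagate particles through an assumed Gaussian transition model.
        particles = particles + action + rng.normal(0.0, 0.1, n_particles)
        # Weight by an assumed Gaussian observation likelihood.
        weights = np.exp(-0.5 * (y - particles) ** 2)
        weights /= weights.sum()
        # Multinomial resampling: the selected indices are a
        # piecewise-constant (non-smooth) function of theta, which is
        # the source of the variance explosion discussed above.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
        total_reward += -abs(y - action)           # illustrative reward
    return total_reward

def naive_fd_gradient(theta, observations, h=1e-4, seed=0):
    """Two-sided finite difference with common random numbers. Because
    the resampled indices can differ between the two runs, the variance
    of this estimator blows up as h -> 0."""
    j_plus = particle_filter_return(theta + h, observations, seed)
    j_minus = particle_filter_return(theta - h, observations, seed)
    return (j_plus - j_minus) / (2.0 * h)

obs = np.random.default_rng(42).normal(size=20)    # synthetic observations
print(naive_fd_gradient(0.5, obs))
```

Running this sketch with decreasing h shows the estimate swinging wildly whenever the perturbation flips a resampling decision, whereas a smooth simulator would see the estimate stabilize; a sounder FD scheme must therefore couple or smooth the resampling step across the two perturbed runs.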