Abstract. We study the convergence of Markov decision processes, composed of a large number of objects, to optimization problems on ordinary differential equations. We show that the optimal reward of such a Markov decision process, which satisfies a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov decision process. We give bounds on the difference of the rewards and an algorithm for deriving an approximating solution to the Markov decision process from a solution of the HJB equations. We illustrate the method on three examples pertaining, respectively, to investment strategies, population dynamics control and scheduling in queues. They are ...
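As a point of reference for the Bellman equation mentioned in the abstract, the following is a minimal sketch of the standard finite-horizon Bellman recursion on a small finite Markov decision process. It is not the mean-field algorithm of the paper, and the function name `backward_induction`, the argument layout (`P[a][s][s2]` for transition probabilities, `r[s][a]` for rewards) and the two-state toy model are all assumptions made for illustration.

```python
def backward_induction(P, r, horizon):
    """Finite-horizon Bellman recursion (illustrative sketch).

    P[a][s][s2]: probability of moving from state s to s2 under action a.
    r[s][a]:     immediate reward for taking action a in state s.
    Returns (V, policy), where V[t][s] is the optimal reward-to-go from
    state s at time t and policy[t][s] is a maximizing action.
    """
    n_actions = len(P)
    n_states = len(r)
    # Terminal values V[horizon] are zero (an assumption of this sketch).
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            # q[a] = r(s, a) + sum over s2 of P(s2 | s, a) * V[t+1][s2]
            q = [
                r[s][a]
                + sum(P[a][s][s2] * V[t + 1][s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: q[a])
            policy[t][s] = best
            V[t][s] = q[best]
    return V, policy


# Hypothetical two-state toy model: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
r = [
    [0, 1],  # state 0: staying pays 0, switching pays 1
    [2, 0],  # state 1: staying pays 2, switching pays 0
]
V, policy = backward_induction(P, r, horizon=2)
```

In this toy model the recursion moves to state 1 and then stays there. The abstract's result concerns how the optimal reward of such a recursion behaves as the number of objects grows, which this sketch does not attempt to reproduce.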
This paper investigates the limit behavior of Markov decision processes made o...
Markov decision processes which allow for an unbounded reward structure are considered. Conditions a...
Many processes, such as discrete event systems in engineering or population dynamics in biology, evo...
We study the convergence of Markov decision processes, composed of a large number of objects, to opt...
Abstract. State-based systems with discrete or continuous time are often modelled with the help of ...
We consider a finite number of $N$ statistically equal individuals, each moving on a finite set of s...
Optimal control provides an appealing machinery to complete complicated control tasks with limited p...
We formally verify executable algorithms for solving Markov decision processes (MDPs) in the interac...
Problems of sequential decisions are marked by the fact that the consequences of a decision made at ...
It is known [2] that policy iteration can be identified with Newton's method (and value iterati...
We derive a new expectation maximization algorithm for policy optimization in linear Gaussian Markov...
In this paper our objective is to study continuous-time Markov decision processes on a general Borel...
Infinite-horizon non-stationary Markov decision processes provide a general framework to model many ...