The Leader-Follower Markov Decision Process (LF-MDP) framework extends both Markov Decision Processes (MDPs) and Stochastic Games. It provides a model in which one agent (the leader) can influence a set of other agents (the followers) that are playing a stochastic game, by modifying their immediate reward functions but not their dynamics. All agents are assumed to act selfishly and to optimize their own long-term expected reward. Finding equilibrium strategies in an LF-MDP is hard, especially when the joint state space of the followers is factored: in that case, it takes time exponential in the number of followers. Our theoretical contribution is threefold. First, we analyze a natural assumption (substitutability of followers), which hold...
We deal with multi-agent Markov Decision Processes (MDPs) in which co-operation among players is all...
This thesis focuses on Mean Field Game (MFG) theory with applications to consensus, flocking, leader...
The intent of this research is to generate a set of non-dominated policies from which one of two age...
Sustainable animal disease management requires designing and implementing control policies at the regio...
The intent of this dissertation is to generate a set of non-dominated finite-memory policies from wh...
We consider an MDP setting in which the reward function is allowed to change during each time step o...
41 pages. We develop an exhaustive study of Markov decision process (MDP) under mean field interaction...
While formal, decision-theoretic models such as the Markov Decision Process (MDP) have greatly advan...
We study large population leader-follower stochastic multi-agent systems where the agents h...
Consider a multi-agent system in a dynamic and uncertain environment. Each agent’s local decision pr...
2014-10-14. This dissertation addresses some problems in the area of learning, optimization and decisi...
There has been substantial progress with formal models for sequential decision making by individual ...
Time-average Markov decision problems are considered for finite state and action spaces. Several...
This paper extends the framework of partially observable Markov decision processes (POMDPs) to multi...
We consider a finite number of $N$ statistically equal individuals, each moving on a finite set of s...