Abstract. Planning for multiple agents under uncertainty is often based on decentralized partially observable Markov decision processes (Dec-POMDPs), but current methods must de-emphasize long-term effects of actions by a discount factor. In tasks such as wireless networking, agents are evaluated by their average performance over time, both short- and long-term effects of actions are crucial, and discounting-based solutions can perform poorly. We show that under a common set of conditions expectation maximization (EM) for average-reward Dec-POMDPs is stuck in a local optimum. We introduce a new average-reward EM method; it outperforms a state-of-the-art discounted-reward Dec-POMDP method in experiments.
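The abstract's contrast between discounted and average-reward evaluation can be illustrated with a minimal sketch (not from the paper; the reward streams below are hypothetical): a discounted objective can prefer a myopic policy that collects one large immediate reward, while the average-reward objective prefers the policy with better long-run performance.

```python
# Toy illustration (hypothetical reward streams, not from the paper):
# discounted vs. average-reward evaluation of two deterministic policies.

def discounted_value(head_rewards, tail_reward, gamma, horizon=10_000):
    """Discounted return of a reward stream that starts with
    `head_rewards` and then repeats `tail_reward` (truncated at `horizon`)."""
    total = 0.0
    for t in range(horizon):
        r = head_rewards[t] if t < len(head_rewards) else tail_reward
        total += (gamma ** t) * r
    return total

def average_value(head_rewards, tail_reward):
    """Long-run average reward: any finite prefix is transient,
    so only the repeating tail reward matters."""
    return tail_reward

# Policy A: reward 10 once, then 0 forever (short-term gain only).
# Policy B: reward 0 once, then 1 forever (steady long-term gain).
gamma = 0.9
disc_a = discounted_value([10.0], 0.0, gamma)   # = 10.0
disc_b = discounted_value([0.0], 1.0, gamma)    # ~ gamma / (1 - gamma) = 9.0
avg_a = average_value([10.0], 0.0)              # 0.0
avg_b = average_value([0.0], 1.0)               # 1.0

print(disc_a > disc_b)  # discounting prefers the myopic policy A
print(avg_b > avg_a)    # average reward prefers the long-run policy B
```

With gamma = 0.9 the discounted criterion ranks the policies in the opposite order from the average-reward criterion, which is the failure mode the abstract attributes to discounting-based solutions in tasks judged by average performance.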
This paper considers the question of under what circumstances average expected reward optima...
Distributed Partially Observable Markov Decision Processes (DEC-POMDPs) are a popular planning frame...
We present a memory-bounded optimization approach for solving infinite-horizon decen-tralized POMDPs...
Decentralized POMDPs provide an expressive framework for multi-agent sequential decision making. Whi...
We address two significant drawbacks of state-of-the-art solvers of decentralized POMDPs (DECPOMDPs)...
Considering Markovian Decision Processes (MDPs), the meaning of an optimal pol...
In this paper we focus on distributed multiagent planning under uncertainty. For single-agent planni...
We advance the state of the art in optimal solving of decentralized partially observable Markov deci...
Bounded parameter Markov Decision Processes (BMDPs) address the issue of dealing with uncertainty in...
The problem of deriving joint policies for a group of agents that maximize some joint reward functi...
Decentralized policies for information gathering are required when multiple autonomous agents are de...
Over the past seven years, researchers have been trying to find algorithms for...
A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy ...