We motivate and propose a new model for non-cooperative Markov games that accounts for the interactions of risk-aware players. This model characterizes the time-consistent dynamic “risk” arising from both stochastic state transitions (inherent to the game) and randomized mixed strategies (due to all other players). We propose an appropriate risk-aware equilibrium concept and demonstrate the existence of such equilibria in stationary strategies by applying Kakutani's fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures which can naturally be written as saddle-point stochastic optimization problems, and ...
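To illustrate what it means for a risk measure to be written as a stochastic optimization problem, the sketch below uses conditional value-at-risk (CVaR) through the Rockafellar-Uryasev formula, CVaR_alpha(X) = min over eta of eta + E[(X - eta)^+] / (1 - alpha). This is an assumption chosen for illustration: the excerpt does not say which risk measures the model uses, and the formula shown is a pure minimization, not the saddle-point form mentioned in the abstract. The function names and the sample data are hypothetical.

```python
def ru_objective(eta, losses, alpha):
    """Rockafellar-Uryasev objective: eta + E[(X - eta)^+] / (1 - alpha)."""
    # Empirical mean of the loss in excess of the threshold eta.
    excess = sum(max(x - eta, 0.0) for x in losses) / len(losses)
    return eta + excess / (1.0 - alpha)


def cvar(losses, alpha):
    """Empirical CVaR at level alpha, as the minimum of the objective over eta.

    The objective is convex and piecewise linear in eta, with kinks at the
    sample values, so a minimizer can always be found among the samples.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return min(ru_objective(eta, losses, alpha) for eta in losses)


if __name__ == "__main__":
    # Hypothetical sample of losses, for illustration only.
    sample_losses = [float(k) for k in range(1, 11)]
    print(cvar(sample_losses, 0.8))
```

Because the expectation enters the objective linearly, it can be replaced by sampled losses, which is what makes formulations of this kind compatible with simulation-based updates. Scanning every sample as a candidate threshold costs quadratic time and is meant only for small illustrative inputs.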
The unifying theme of this thesis is the design and analysis of adaptive procedures that are aimed a...
This article presents the state estimation method based on Monte Carlo sampling in a partially obser...
This paper investigates the use of model-free reinforcement learning to compute the optimal value in...
A large class of sequential decision making problems under uncertainty with multiple competing decis...
In stochastic games with incomplete information, the uncertainty is evoked by the lack of knowledge ...
Dynamic zero-sum games are a model of multiagent decision-making that has been well-studied in the m...
Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due ...
We propose a multi-agent reinforcement learning dynamics, and analyze its convergence properties in ...
A population of agents plays a stochastic dynamic game wherein there is an underlying state process ...
We present a new algorithm for polynomial time learning of optimal behavior in stochastic games. Thi...
Dynamic noncooperative multiagent systems are systems where self-interested agents interact with eac...
In multi-agent systems, intelligent agents are tasked with making decisions that have optimal outcom...
Learning behaviors in a multiagent environment is crucial for developing and adapting multiagent sys...
In this paper, we address multi-agent decision problems where all agents share a common goal. This c...
In this manuscript, we develop reinforcement learning theory and algorithms for differential games w...