In this work, we address risk-averse Bayes-adaptive reinforcement learning. We pose the problem of optimising the conditional value at risk (CVaR) of the total return in Bayes-adaptive Markov decision processes (MDPs). We show that a policy optimising CVaR in this setting is risk-averse to both the epistemic uncertainty due to the prior distribution over MDPs and the aleatoric uncertainty due to the inherent stochasticity of MDPs. We reformulate the problem as a two-player stochastic game and propose an approximate algorithm based on Monte Carlo tree search and Bayesian optimisation. Our experiments demonstrate that our approach significantly outperforms baseline approaches for this problem.
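For reference, the objective named above is the standard conditional value at risk; the following is a textbook statement of that definition (added here for clarity, not taken verbatim from the paper). For a return random variable Z with a continuous distribution and a risk level α ∈ (0, 1],

\[
\mathrm{VaR}_\alpha(Z) = \inf\{\, z \in \mathbb{R} : \Pr(Z \le z) \ge \alpha \,\}, \qquad
\mathrm{CVaR}_\alpha(Z) = \mathbb{E}\!\left[\, Z \mid Z \le \mathrm{VaR}_\alpha(Z) \,\right].
\]

Maximising CVaR_α of the total return therefore concentrates on the expected return in the worst α-fraction of outcomes, where the randomness in Z covers both the draw of the MDP from the prior (epistemic uncertainty) and the stochastic transitions within that MDP (aleatoric uncertainty).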