Approximate Bayes Optimal Policy Search using Neural Networks

Castronovo, Michaël
François-Lavet, Vincent
Fonteneau, Raphaël
Ernst, Damien
Couëtoux, Adrien

Publication date

February 2017

DOI

10.5220/0006191701420153

Publisher

Scitepress

Abstract

Peer reviewed

Bayesian Reinforcement Learning (BRL) agents aim to maximise the expected collected rewards obtained when interacting with an unknown Markov Decision Process (MDP) while using some prior knowledge. State-of-the-art BRL agents rely on frequent updates of the belief on the MDP, as new observations of the environment are made. This offers theoretical guarantees to converge to an optimum, but is computationally intractable, even on small-scale problems. In this paper, we present a method that circumvents this issue by training a parametric policy able to recommend an action directly from raw observations. Artificial Neural Networks (ANNs) are used to represent this policy, and are trained on the trajectories sampled from the prior....
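The general idea of training a policy offline on trajectories sampled from a prior can be illustrated with a small sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the prior (Dirichlet transitions, uniform rewards), the history features, the use of value iteration to label each step with the sampled MDP's optimal action, and the softmax-linear policy that stands in for an ANN. All function names are hypothetical.

```python
import numpy as np

def sample_mdp(rng, S, A):
    # Hypothetical prior: Dirichlet transition rows, uniform rewards in [0, 1].
    P = rng.dirichlet(np.ones(S), size=(S, A))   # shape (S, A, S)
    R = rng.uniform(size=(S, A))                 # shape (S, A)
    return P, R

def optimal_actions(P, R, gamma=0.95, n_iter=200):
    # Value iteration on a fully known sampled MDP; returns one action per state.
    V = np.zeros(P.shape[0])
    for _ in range(n_iter):
        Q = R + gamma * (P @ V)                  # shape (S, A)
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

def make_dataset(rng, S=3, A=2, n_mdps=200, horizon=20):
    # Roll out a uniformly random behaviour policy in each sampled MDP and
    # label every step with that MDP's optimal action (an assumed labelling).
    X, y = [], []
    for _ in range(n_mdps):
        P, R = sample_mdp(rng, S, A)
        best = optimal_actions(P, R)
        counts = np.zeros((S, A, S))             # observed transition counts
        rewards = np.zeros((S, A))               # observed reward sums
        s = 0
        for t in range(horizon):
            norm = max(t, 1)
            feat = np.concatenate([np.eye(S)[s], counts.ravel() / norm,
                                   rewards.ravel() / norm, [1.0]])
            X.append(feat)
            y.append(best[s])
            a = rng.integers(A)
            s_next = rng.choice(S, p=P[s, a])
            counts[s, a, s_next] += 1
            rewards[s, a] += R[s, a]
            s = s_next
    return np.array(X), np.array(y)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, X, y):
    p = softmax(X @ W)
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

def train_policy(X, y, A, lr=0.5, n_steps=300):
    # Full-batch gradient descent on a softmax-linear policy (ANN stand-in).
    W = np.zeros((X.shape[1], A))
    Y = np.eye(A)[y]
    for _ in range(n_steps):
        W -= lr * X.T @ (softmax(X @ W) - Y) / len(y)
    return W

rng = np.random.default_rng(0)
X, y = make_dataset(rng)
W = train_policy(X, y, A=2)
initial_loss = cross_entropy(np.zeros_like(W), X, y)
final_loss = cross_entropy(W, X, y)
```

Once trained, such a policy maps the current observation history to action probabilities with a single forward pass, so no belief update is needed at decision time. That is the computational trade-off the abstract points to, although the sketch makes no claim about matching the paper's architecture or results.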
