Abstract The problem of reinforcement learning in a non-Markov environment is explored using a dynamic Bayesian network, where conditional independence assumptions between random variables are compactly represented by network parameters. The parameters are learned on-line, and approximations are used to perform inference and to compute the optimal value function. The relative effects of inference and value function approximations on the quality of the final policy are investigated by learning to solve a moderately difficult driving task. The two value function approximations, linear and quadratic, were found to perform similarly, but the quadratic model was more sensitive to initialization. Both performed below the level of human performance on...
Techniques based on reinforcement learning (RL) have been used to build systems that learn t...
Partially observable Markov decision processes (POMDPs) are interesting because they provide a gener...
Learning to act optimally in the complex world has long been a major goal in artificial intelligence...
Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due ...
Reinforcement learning (RL) algorithms provide a sound theoretical basis for building learning contr...
People are efficient when they make decisions under uncertainty, even when their decisions have long...
Reinforcement learning is a general computational framework for learning sequential decision strate...
In this paper, we describe how techniques from reinforcement learning might be used to approach the ...
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and t...
Markovian systems are widely used in reinforcement learning (RL), when the suc...
We present a Bayesian reinforcement learning model with a working memory module which can solve some...
This dissertation considers a particular aspect of sequential decision making under uncertainty in w...
The field of Reinforcement Learning is concerned with teaching agents to take optimal decisions t...
Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elega...
Acting in domains where an agent must plan several steps ahead to achieve a goal can be a ch...