This thesis tests the hypothesis that distributional deep reinforcement learning (RL) algorithms outperform expectation-based deep RL because of the regularizing effect of fitting a more complex model. The hypothesis was tested by comparing two variations of the distributional QR-DQN algorithm combined with prioritized experience replay. The first variation, called QR-W, prioritizes learning the return distributions. The second, QR-TD, prioritizes learning the Q-values. Both algorithms were tested with a range of network architectures, from overly large ones that are prone to overfitting to smaller ones that are prone to underfitting. To verify the findings, the experiment was run in two environments. As hypot...
Peer reviewed. Using deep neural nets as function approximator for reinforcement learning tasks have r...
Enabling mobile robots to autonomously navigate complex environments is essential for real-world dep...
This thesis investigates how general the knowledge stored in deep Q-networks is. This general knowl...
Recent years have seen a growing interest in the use of deep neural networks as function approximato...
The popular Q-learning algorithm is known to overestimate action values under certain conditions. It...
Deep reinforcement learning (RL) has achieved several high-profile successes in difficult decision-m...
Code: https://github.com/google-research/google-research/tree/master/munchausen_rl International audi...
In recent years, deep neural networks have powered many successes in deep reinforcement learning (DR...
In reinforcement learning, Q-learning is the best-known algorithm, but it suffers from overestimation...
Deep reinforcement learning (DRL) systems have transformed artificial intelligence by solving complex...
Deep Reinforcement Learning has yielded proficient controllers for complex tasks. However, these con...
International audience. Consistent and reproducible evaluation of Deep Reinforcement Learning (DRL) is...
Reinforcement learning and especially deep reinforcement learning are research areas which are getti...
Experience replay plays an essential role as an information-generating mechanism in reinforcement le...
Deep neural networks are the most commonly used function approximators in offline reinforcement lear...