Deep Q-learning Network (DQN) is a successful method that combines reinforcement learning with deep neural networks and has led to the widespread application of reinforcement learning. One challenge when applying DQN or other reinforcement learning algorithms to real-world problems is data collection. Improving data efficiency is therefore one of the most important problems in reinforcement learning research. In this paper, we propose a framework that uses the Max-Mean loss in Deep Q-Network (M$^2$DQN). Instead of sampling one batch of experiences in each training step, we sample several batches from the experience replay and update the parameters so that the maximum TD-error of these batches is minimized. The proposed met...
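The batch-selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-batch loss is the mean squared one-step TD error, and the names `td_error` and `max_mean_loss`, the tuple layout of a transition, and the discount `gamma=0.99` are all hypothetical choices made for illustration. A real agent would compute the Q-values with a network and backpropagate through the selected loss.

```python
def td_error(q_sa, reward, next_q_max, done, gamma=0.99):
    """One-step TD error: bootstrapped target minus the current estimate.

    All arguments are plain floats here (assumption); in a real agent
    q_sa and next_q_max would come from the online and target networks.
    """
    target = reward + (0.0 if done else gamma * next_q_max)
    return target - q_sa


def max_mean_loss(batches, gamma=0.99):
    """Return (loss, index) for the batch with the largest mean squared TD error.

    `batches` is a list of batches; each batch is a non-empty list of
    (q_sa, reward, next_q_max, done) tuples (hypothetical layout).
    """
    mean_losses = []
    for batch in batches:
        squared = [td_error(*transition, gamma=gamma) ** 2 for transition in batch]
        mean_losses.append(sum(squared) / len(squared))
    # The "max" over batches of the "mean" loss within each batch.
    worst = max(range(len(mean_losses)), key=mean_losses.__getitem__)
    return mean_losses[worst], worst


# Two toy batches of terminal transitions, so the target is just the reward.
batch_a = [(1.0, 1.0, 0.0, True), (0.0, 0.0, 0.0, True)]  # TD errors: 0, 0
batch_b = [(0.0, 1.0, 0.0, True), (0.0, 2.0, 0.0, True)]  # TD errors: 1, 2
loss, index = max_mean_loss([batch_a, batch_b])
print(loss, index)
```

Under these assumptions, a gradient step would be taken only on the returned loss, i.e. on the sampled batch the current parameters fit worst.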
We present a novel approach for learning an approximation of the optimal state-action ...
We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-V...
Reinforcement learning can be compared to how humans learn – by interaction, which is the fundamental...
The popular Q-learning algorithm is known to overestimate action values under certain conditions. It...
Deep Q-Networks algorithm (DQN) was the first reinforcement learning algorithm using deep neural net...
In the past decade, machine learning strategies centered on the use of Deep Neural Networks (DNNs) h...
Using deep neural nets as function approximators for reinforcement learning tasks has r...
We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-Value (DQV) Lea...
The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overe...
Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-m...
Reinforcement Learning is being used to solve various tasks. A Complex Environment is a recent probl...
Various pathologies can occur when independent learners are used in cooperative Multi-Agent Reinforc...
We analyze the DQN reinforcement learning algorithm as a stochastic approximat...
Data inefficiency is one of the major challenges for deploying deep reinforcement learning algorithm...
We propose Bayesian Deep Q-Network (BDQN), a practical Thompson sampling based Reinforcement Learnin...