End-to-end reinforcement learning on images has shown significant performance progress in recent years, especially with the regularization of value estimation brought by data augmentation (Yarats et al., 2020). At the same time, domain randomization and representation learning have helped push the limits of these algorithms in visually diverse environments full of distractors and spurious noise, making RL more robust to unrelated visual features. We present DIQL, a method that combines risk-invariant regularization and domain randomization to reduce the out-of-distribution (OOD) generalization gap for temporal-difference learning. In this work, we draw a link by framing domain randomization as a richer extension of data augmentat...
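As a point of reference for the augmentation-based value regularization cited above (Yarats et al., 2020), a minimal sketch of that idea follows; the helper names (random_shift, averaged_q_target) and the discrete-action Q-network interface are illustrative assumptions, not code from the paper.

```python
# Hedged sketch: DrQ-style regularization of value estimation via image
# augmentation. The TD bootstrap is averaged over randomly shifted views
# of the next observation. Names and interfaces are illustrative only.
import torch
import torch.nn.functional as F

def random_shift(obs: torch.Tensor, pad: int = 4) -> torch.Tensor:
    """Pad an image batch (N, C, H, W) with edge replication, then crop it
    back to the original size at a random offset (one shift per batch,
    for brevity)."""
    n, c, h, w = obs.shape
    padded = F.pad(obs, (pad, pad, pad, pad), mode="replicate")
    top = int(torch.randint(0, 2 * pad + 1, (1,)))
    left = int(torch.randint(0, 2 * pad + 1, (1,)))
    return padded[:, :, top:top + h, left:left + w]

def averaged_q_target(q_net, next_obs, reward, discount, num_aug: int = 2):
    """TD target that bootstraps from the mean greedy Q-value over several
    randomly shifted views of the next observation (assumes q_net maps a
    batch of images to per-action Q-values)."""
    with torch.no_grad():
        boot = torch.stack(
            [q_net(random_shift(next_obs)).max(dim=1).values
             for _ in range(num_aug)]
        ).mean(dim=0)
    return reward + discount * boot
```

In this framing, domain randomization can be read as swapping random_shift for a richer family of environment-level perturbations, which is the link the abstract above draws between the two.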
Reinforcement Learning (RL) algorithms allow artificial agents to improve their action selection pol...
Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-m...
This is the version of record. It originally appeared on arXiv at http://arxiv.org/abs/1603.00748. Mo...
End-to-end reinforcement learning on images has shown significant progress in the...
Offline reinforcement learning (RL) defines the task of learning from a static logged dataset withou...
Deep reinforcement learning policies, despite their outstanding efficiency in ...
Model-based reinforcement learning (MBRL) has been used to efficiently solve vision-based control ta...
Despite remarkable successes, Deep Reinforcement Learning (DRL) is not robust ...
Training an agent to solve control tasks directly from high-dimensional images with model-free reinf...
Over the past few years, the acceleration of computing resources and research in deep learning has l...
Reinforcement learning (RL) is a general framework for learning and evaluating intelligent behaviors...
Visual model-based RL methods typically encode image observations into low-dimensional representatio...
If reinforcement learning (RL) techniques are to be used for "real world" dynamic system c...
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that...
In many Reinforcement Learning (RL) tasks, the classical online interaction of the learning agent wi...