In offline reinforcement learning (RL), one issue detrimental to policy learning is the accumulation of errors in the deep Q function in out-of-distribution (OOD) areas. Unfortunately, existing offline RL methods are often over-conservative, which inevitably hurts generalization performance outside the data distribution. In our study, one interesting observation is that deep Q functions approximate well inside the convex hull of the training data. Inspired by this, we propose a new method, DOGE (Distance-sensitive Offline RL with better GEneralization). DOGE marries dataset geometry with deep function approximators in offline RL and enables exploitation in generalizable OOD areas rather than strictly constraining the policy within the data distribution. Specifically,...
We present a model-based offline reinforcement learning policy performance lower bound that explicit...
Offline Reinforcement Learning (RL) aims at learning an optimal control from a fixed dataset, withou...
We consider the off-policy evaluation problem of reinforcement learning using deep convolutional neu...
Offline reinforcement learning (RL) defines the task of learning from a static logged dataset withou...
Policy constraint methods for offline reinforcement learning (RL) typically utilize parameterization ...
We consider the offline reinforcement learning problem, where the aim is to learn a decision making ...
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that...
Offline reinforcement learning (RL) provides a promising direction to exploit the massive amount of ...
Offline reinforcement learning (RL) aims to learn a policy from the passively collected offline datase...
Sample-efficient offline reinforcement learning (RL) with linear function approximation has been stu...
In this dissertation we develop new methodologies and frameworks to address challenges in offline re...
We consider reinforcement learning (RL) methods in offline domains without additional online data co...
Offline reinforcement learning (RL) concerns pursuing an optimal policy for sequential decision-maki...
The application of Reinforcement Learning (RL) in real-world environments can be expensive or risky ...
Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learni...