Offline Reinforcement Learning (RL) aims at learning an optimal control from a fixed dataset, without interactions with the system. An agent in this setting should avoid selecting actions whose consequences cannot be predicted from the data. This is the converse of exploration in RL, which favors such actions. We thus take inspiration from the literature on bonus-based exploration to design a new offline RL agent. The core idea is to subtract a prediction-based exploration bonus from the reward, instead of adding it for exploration. This allows the policy to stay close to the support of the dataset. We connect this approach to a more common regularization of the learned policy towards the data. Instantiated with a bonu...
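The core idea stated above, subtracting a prediction-based bonus from the reward, can be illustrated with a small sketch. This is not the bonus used in the work itself (the abstract is cut off before naming it). As a stand-in, the sketch assumes a random-network-distillation-style bonus: a linear predictor is fit on the dataset to match a fixed random target, and its squared prediction error serves as the bonus. All names (`dataset_sa`, `W_target`, `bonus`, `anti_exploration_reward`, `alpha`) and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical offline dataset: 500 concatenated (state, action) vectors
# clustered around the origin. Synthetic data, for illustration only.
dataset_sa = rng.normal(0.0, 0.5, size=(500, 3))

# Assumption: a fixed random target function, in the style of random
# network distillation. It is never trained.
W_target = rng.normal(size=(3, 8))


def target(sa):
    """Fixed random features of a batch of (state, action) vectors."""
    return np.tanh(np.atleast_2d(sa) @ W_target)


def add_bias(sa):
    """Append a constant column so the linear predictor has an intercept."""
    sa = np.atleast_2d(sa)
    return np.hstack([sa, np.ones((sa.shape[0], 1))])


# Fit the predictor on dataset points only (ordinary least squares).
W_pred, *_ = np.linalg.lstsq(add_bias(dataset_sa), target(dataset_sa), rcond=None)


def bonus(sa):
    """Squared prediction error: small near the data, large far from it."""
    err = add_bias(sa) @ W_pred - target(sa)
    return np.sum(err ** 2, axis=1)


def anti_exploration_reward(reward, sa, alpha=1.0):
    """Subtract the bonus from the reward (exploration would add it)."""
    return reward - alpha * bonus(sa)


in_support = np.zeros(3)          # close to the dataset
out_of_support = np.full(3, 5.0)  # far from anything in the dataset
print(bonus(in_support), bonus(out_of_support))
```

With the same raw reward, the out-of-support pair receives a much lower modified reward than the in-support pair, which is what pushes a policy trained on the modified reward towards the support of the dataset. The scale `alpha` is a free parameter in this sketch.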
In this dissertation we develop new methodologies and frameworks to address challenges in offline re...
Offline reinforcement learning (RL) can learn control policies from static datasets but, like standa...
While exploring to find better solutions, an agent performing on-line reinforcement learning (RL) ca...
Training Reinforcement Learning (RL) agents in high-stakes applications might be too prohibitive due...
In many Reinforcement Learning (RL) tasks, the classical online interaction of the learning agent wi...
Conventional reinforcement learning (RL) needs an environment to collect fresh data, which is imprac...
Reinforcement learning is a powerful approach for learning control policies that solve sequential de...
Offline reinforcement learning -- learning a policy from a batch of data -- is known to be hard for ...
Recent advances in reinforcement learning (RL) provide exciting potential for making agents...
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that...
Offline reinforcement learning (RL) extends the paradigm of classical RL algorithms to purely learni...
Offline reinforcement learning (RL) enables effective learning from previously collected data withou...
Offline reinforcement learning algorithms still lack trust in practice due to the risk that the lear...
Building autonomous agents that learn to make predictions and take actions in sequential environment...