In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome a limitation of the soft actor-critic (SAC) algorithm, which implements maximum entropy RL in model-free, sample-based learning. Whereas maximum entropy RL guides policies toward states with high entropy in the future, the proposed max-min entropy framework learns to visit states with low entropy and to maximize the entropy of these low-entropy states, which promotes better exploration. For general Markov decision processes (MDPs), we construct an efficient algorithm under the proposed framework, based on the disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields...
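For orientation, the maximum entropy RL objective that SAC optimizes adds a policy-entropy bonus to each reward. The sketch below is a minimal illustration of that standard objective for a discrete action space; it is not the max-min entropy algorithm proposed in the paper. The function names `policy_entropy` and `soft_return`, and the default values of `alpha` and `gamma`, are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def policy_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete action distribution."""
    # Zero-probability actions contribute nothing (0 * log 0 is taken as 0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_return(
    rewards: Sequence[float],
    action_probs: Sequence[Sequence[float]],
    alpha: float = 0.2,   # entropy temperature (illustrative default)
    gamma: float = 0.99,  # discount factor (illustrative default)
) -> float:
    """Discounted sum of r_t + alpha * H(pi(.|s_t)) along one trajectory.

    rewards[t] is the reward at step t and action_probs[t] is the policy's
    action distribution in the state visited at step t.
    """
    total = 0.0
    for t, (r, probs) in enumerate(zip(rewards, action_probs)):
        total += (gamma ** t) * (r + alpha * policy_entropy(probs))
    return total
```

With `alpha = 0` this reduces to the ordinary discounted return; a larger `alpha` rewards trajectories that pass through states where the policy is more stochastic, which is the behavior the abstract contrasts with the max-min entropy framework.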
The maximum entropy (MaxEnt) framework has been studied extensively in supervised learning. Here, the go...
The training process analysis and termination condition of the training process of a Reinforcement L...
ICML-2023. We address the challenge of exploration in reinforcement learning (RL) when the agent opera...
In this thesis, we study how the maximum entropy framework can provide efficient deep reinforcement lear...
We provide new perspectives and inference algorithms for Maximum Entropy (MaxEnt) Inverse Reinforcem...
In this paper, we present a new class of Markov decision processes (MDPs), called Tsallis MDPs, with...
We present a framework to address a class of sequential decision making problems. Our framework feat...
We make decisions to maximize our perceived reward, but handcrafting a reward function for an autono...
Reinforcement learning (RL) is an important field of research in machine learning that is increasing...
In the maximum state entropy exploration framework, an agent interacts with a reward-free environmen...