An increasing number of complex problems have naturally posed significant challenges in decision-making theory and reinforcement learning practices. These problems often involve multiple conflicting reward signals that inherently cause agents’ poor exploration in seeking a specific goal. In extreme cases, the agent gets stuck in a sub-optimal solution and starts behaving harmfully. To overcome such obstacles, we introduce two actor-critic deep reinforcement learning methods, namely Multi-Critic Single Policy (MCSP) and Single Critic Multi-Policy (SCMP), which can adjust agent behaviors to efficiently achieve a designated goal by adopting a weighted-sum scalarization of different objective functions. In particular, MCSP creates a human-centr...
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles...
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles...
We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for...
Humans live among other humans, not in isolation. Therefore, the ability to learn and behave in mult...
In many decision-making problems, agents aim to balance multiple, possibly conflicting objectives. E...
We posit a new mechanism for cooperation in multi-agent reinforcement learning (MARL) based upon an...
Recent advances of actor-critic methods in deep reinforcement learning have enabled performing sever...
The field of deep reinforcement learning has seen major successes recently, achieving superhuman per...
This paper presents the first actor-critic al-gorithm for off-policy reinforcement learning. Our alg...
This paper presents the first actor-critic al-gorithm for off-policy reinforcement learning. Our alg...
Reinforcement learning (RL) is generally considered as the machine learning answer to the optimal co...
The most basic and primary skills of a robot are pushing and grasping. In cluttered scenes, push to ...
Editor’s Summary: Chapter?? introduced policy gradients as a way to improve on stochastic search of ...
We present an actor-critic scheme for reinforcement learning in complex domains. The main contributi...
© 2019 IEEE. Multiagent reinforcement learning (MARL) algorithms have been demonstrated on complex t...
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles...
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles...
We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for...
Humans live among other humans, not in isolation. Therefore, the ability to learn and behave in mult...
In many decision-making problems, agents aim to balance multiple, possibly conflicting objectives. E...
We posit a new mechanism for cooperation in multi-agent reinforcement learning (MARL) based upon an...
Recent advances of actor-critic methods in deep reinforcement learning have enabled performing sever...
The field of deep reinforcement learning has seen major successes recently, achieving superhuman per...
This paper presents the first actor-critic al-gorithm for off-policy reinforcement learning. Our alg...
This paper presents the first actor-critic al-gorithm for off-policy reinforcement learning. Our alg...
Reinforcement learning (RL) is generally considered as the machine learning answer to the optimal co...
The most basic and primary skills of a robot are pushing and grasping. In cluttered scenes, push to ...
Editor’s Summary: Chapter?? introduced policy gradients as a way to improve on stochastic search of ...
We present an actor-critic scheme for reinforcement learning in complex domains. The main contributi...
© 2019 IEEE. Multiagent reinforcement learning (MARL) algorithms have been demonstrated on complex t...
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles...
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles...
We propose a deep reinforcement learning algorithm that employs an adversarial training strategy for...