Multiobjective reinforcement learning algorithms extend reinforcement learning techniques to problems with multiple conflicting objectives. This paper discusses the advantages gained from applying stochastic policies to multiobjective tasks and examines a particular form of stochastic policy known as a mixture policy. Two methods are proposed for deriving mixture policies for episodic multiobjective tasks from deterministic base policies found via scalarised reinforcement learning. It is shown that these approaches are an efficient means of identifying solutions which offer a better match to the user’s preferences than can be achieved by methods based strictly on deterministic policies.
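To make the idea of a mixture policy concrete, the following is a minimal sketch rather than either of the two methods proposed in the paper. It assumes a mixture policy is one that picks a single deterministic base policy at random at the start of each episode and follows it until the episode ends, so that the expected vector return is the probability-weighted average of the base policies' returns. All names (`select_base_policy`, `expected_mixture_return`, `mixing_probability`) and the numeric returns in the usage note are hypothetical.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical type: a deterministic base policy maps a state to an action.
Policy = Callable[[int], int]


def select_base_policy(policies: Sequence[Policy],
                       probs: Sequence[float],
                       rng: random.Random) -> Policy:
    """Pick one base policy, to be followed for the whole episode."""
    return rng.choices(policies, weights=probs, k=1)[0]


def expected_mixture_return(base_returns: Sequence[Sequence[float]],
                            probs: Sequence[float]) -> List[float]:
    """Expected vector return: probability-weighted average of base returns."""
    n_objectives = len(base_returns[0])
    return [sum(p * r[i] for p, r in zip(probs, base_returns))
            for i in range(n_objectives)]


def mixing_probability(value_a: float, value_b: float, target: float) -> float:
    """Probability of choosing policy A (else B) so that the expected return
    on one objective equals `target`. Assumes exactly two base policies."""
    if value_a == value_b:
        raise ValueError("base policies do not differ on this objective")
    p = (target - value_b) / (value_a - value_b)
    if not 0.0 <= p <= 1.0:
        raise ValueError("target lies outside the range of the base policies")
    return p
```

For example, with two base policies whose (made-up) returns are `(10.0, 0.0)` and `(0.0, 4.0)`, `mixing_probability(10.0, 0.0, 2.5)` gives `0.25`, and `expected_mixture_return([(10.0, 0.0), (0.0, 4.0)], [0.25, 0.75])` gives `[2.5, 3.0]`, a trade-off that neither deterministic policy achieves on its own. Note that this average holds only in expectation over many episodes; any single episode yields the return of whichever base policy was drawn.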
The solution for a Multi-Objective Reinforcement Learning problem is a set of Pareto optimal policie...
In this article we study the connection of stochastic optimal control and reinforcement learning. Ou...
In this paper we provide empirical data of the performance of the two most commonly used multiobject...
A common approach to address multiobjective problems using reinforcement learning methods is to exte...
Many real-world problems involve the optimization of multiple, possibly conflicting objectives. Mul...
Many real-life problems involve dealing with multiple objectives. For example, in network routing th...
In many real-world scenarios, the utility of a user is derived from the single execution of a policy...
Reinforcement Learning (RL) is a successful technique to train autonomous agents. However, the cla...
This work describes MPQ-learning, a temporal-difference method that approximates the set of all non...
We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabili...
Multiobjective reinforcement learning (MORL) extends RL to problems with multiple conflicting object...
For reinforcement learning tasks with multiple objectives, it may be advantageous to learn stochasti...
This thesis investigates the following question: Can supervised learning techniques be successfully ...
Real-world sequential decision-making tasks are generally complex, requiring trade-offs between mult...
This paper describes a novel multi-objective reinforcement learning algorithm. The proposed algorith...