For reinforcement learning tasks with multiple objectives, it may be advantageous to learn stochastic or non-stationary policies. This paper investigates two novel algorithms for learning non-stationary policies which produce Pareto-optimal behaviour (w-steering and Q-steering), by extending prior work based on the concept of geometric steering. Empirical results demonstrate that both new algorithms offer substantial performance improvements over stationary deterministic policies, while Q-steering significantly outperforms w-steering when the agent has no information about recurrent states within the environment. It is further demonstrated that Q-steering can be used interactively by providing a human decision-maker with a visualisation of ...
The operation of large-scale water resources systems often involves several conflicting and noncomme...
Many real-life problems involve dealing with multiple objectives. For example, in network routing th...
A common approach to address multiobjective problems using reinforcement learning methods is to exte...
For reinforcement learning tasks with multiple objectives, it may be advantageous to learn stochasti...
Many real-world problems involve the optimization of multiple, possibly conflicting ob-jectives. Mul...
Reinforcement learning is a family of machine learning algorithms, in which the system learns to mak...
This paper describes a novel multi-objective reinforcement learning algorithm. The proposed algorith...
\u3cp\u3eThis paper describes a novel multi-objective reinforcement learning algorithm. The proposed...
Reinforcement Learning (RL) is a successful technique to train autonomous agents. However, the cla...
This paper addresses the problem of learning multidimensional control actions from delayed rewards. ...
Reinforcement learning is a promising technique for learning agents to adapt their own strategies in...
The solution for a Multi-Objetive Reinforcement Learning problem is a set of Pareto optimal policie...
When an agent learns in a multi-agent environment, the payoff it receives is dependent on the behavi...
In this talk we present PQ-learning, a new Reinforcement Learning (RL) algorithm that determines th...
Reinforcement learning scales poorly when reinforcements are delayed. The problem of propagating inf...
The operation of large-scale water resources systems often involves several conflicting and noncomme...
Many real-life problems involve dealing with multiple objectives. For example, in network routing th...
A common approach to address multiobjective problems using reinforcement learning methods is to exte...
For reinforcement learning tasks with multiple objectives, it may be advantageous to learn stochasti...
Many real-world problems involve the optimization of multiple, possibly conflicting ob-jectives. Mul...
Reinforcement learning is a family of machine learning algorithms, in which the system learns to mak...
This paper describes a novel multi-objective reinforcement learning algorithm. The proposed algorith...
\u3cp\u3eThis paper describes a novel multi-objective reinforcement learning algorithm. The proposed...
Reinforcement Learning (RL) is a successful technique to train autonomous agents. However, the cla...
This paper addresses the problem of learning multidimensional control actions from delayed rewards. ...
Reinforcement learning is a promising technique for learning agents to adapt their own strategies in...
The solution for a Multi-Objetive Reinforcement Learning problem is a set of Pareto optimal policie...
When an agent learns in a multi-agent environment, the payoff it receives is dependent on the behavi...
In this talk we present PQ-learning, a new Reinforcement Learning (RL) algorithm that determines th...
Reinforcement learning scales poorly when reinforcements are delayed. The problem of propagating inf...
The operation of large-scale water resources systems often involves several conflicting and noncomme...
Many real-life problems involve dealing with multiple objectives. For example, in network routing th...
A common approach to address multiobjective problems using reinforcement learning methods is to exte...