This work proposes an approach based on reward shaping techniques in a reinforcement learning setting to approximate the optimal decision-making process (also called the optimal policy) for a desired task with a limited amount of data. We extract prior information from an existing family of policies and use it as a heuristic to help construct the new policy under this challenging condition. We use this approach to study the relationship between the similarity of two tasks and the minimal amount of data needed to compute a near-optimal policy for the second one using the prior information of the existing policy. Preliminary results show that for the least similar existing task considered compared to the desired one,...
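As an illustration only, here is a minimal sketch of one common form of reward shaping, potential-based shaping, in which the shaped reward is r + gamma * phi(s') - phi(s). The abstract does not say which shaping scheme the work uses, so the choice of potential-based shaping, the function name shaped_reward, the five-state chain, and the prior_values table (standing in for value estimates taken from an existing policy) are all assumptions made for this example.

```python
from typing import Callable, Dict


def shaped_reward(
    reward: float,
    state: int,
    next_state: int,
    potential: Callable[[int], float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Return the environment reward plus a potential-based shaping term.

    The shaping term is gamma * phi(next_state) - phi(state). The potential of
    a terminal next state is treated as zero.
    """
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


# Hypothetical prior information: value estimates of an existing policy on a
# five-state chain whose goal is state 4. These numbers are made up.
prior_values: Dict[int, float] = {0: 0.0, 1: 0.25, 2: 0.5, 3: 0.75, 4: 1.0}


def prior_potential(state: int) -> float:
    """Use the prior policy's value estimate as the potential (assumption)."""
    return prior_values.get(state, 0.0)


if __name__ == "__main__":
    # Moving toward the goal earns a positive shaping bonus even though the
    # environment reward is zero.
    print(shaped_reward(0.0, 1, 2, prior_potential, gamma=1.0))
```

In this sketch the prior policy only influences learning through the potential, so a poorly matched prior changes how quickly the new policy is learned rather than which policy is optimal; that property holds for potential-based shaping specifically and may not apply to the scheme used in the work itself.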
We present algorithms to effectively represent a set of Markov decision processes (MDPs), whose opti...
Various methods for solving the inverse reinforcement learning (IRL) problem have been developed ind...
© 2020 IEEE. A common approach for defining a reward function for multi-objective reinforcement lear...
Transfer learning has proven to be a wildly successful approach for speeding up reinforcement learn...
Reinforcement Learning research is traditionally devoted to solving single-task problems. Therefore, a...
The field of Reinforcement Learning is concerned with teaching agents to take optimal decisions t...
The goal of task transfer in reinforcement learning is migrating the action policy of an agent to th...
This thesis is mostly focused on reinforcement learning, which is viewed as an optimization problem:...
Reinforcement Learning methods are capable of solving complex problems, but resulting policies might...
Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimi...
Reinforcement learning algorithms are very effective at learning policies (mappings from states to a...
Abstract. We present a Reinforcement Learning (RL) algorithm based on policy iteration for solving a...
Graduation date: 2005. Reinforcement learning (RL) is the study of systems that learn from interaction...
We contribute Policy Reuse as a technique to improve a reinforcement learning agent with guidance f...
We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which e...