International audienceIn some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with policy advice (RLPA) algorithm which leverages this input set and learns to use the best policy in the set for the reinforcement learning task at hand. We prove that RLPA has a sub-linear regret of $\widetilde O(\sqrt{T})$ relative to the best input policy, and that both this regret and its computational complexity are independent of the size of the state and action space. Our empirical simulations support our theoretical analysis. This suggests RLPA may offer significant advantages in large domains where some prior good po...
Sequentially making-decision abounds in real-world problems ranging from robots needing to interact ...
We contribute Policy Reuse as a technique to improve a re-inforcement learning agent with guidance f...
In this master thesis, we have tried to solve two of most prominent Reinforcement Learning problems:...
International audienceIn some reinforcement learning problems an agent may be provided with a set of...
International audienceWe consider an agent interacting with an environment in a single stream of act...
Reinforcement learning (RL) focuses on an essential aspect of intelligent behavior – how an agent ca...
Thesis (Ph.D.), Computer Science, Washington State UniversityTransfer learning is a method in machin...
International audienceWe consider the problem of online reinforcement learning when several state re...
We consider an agent interacting with an environment in a single stream of actions, observations, an...
Different from classic Supervised Learning, Reinforcement Learning (RL), is fundamentally interactiv...
The field of Reinforcement Learning is concerned with teaching agents to take optimal decisions t...
State-of-the-art reinforcement learning (RL) algorithms generally require a large sample of interact...
Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some mea...
In this work, we study model-based reinforcement learning (RL) in unknown stabilizable linear dynami...
Structured reinforcement learning leverages policies with advantageous properties to reach better pe...
Sequentially making-decision abounds in real-world problems ranging from robots needing to interact ...
We contribute Policy Reuse as a technique to improve a re-inforcement learning agent with guidance f...
In this master thesis, we have tried to solve two of most prominent Reinforcement Learning problems:...
International audienceIn some reinforcement learning problems an agent may be provided with a set of...
International audienceWe consider an agent interacting with an environment in a single stream of act...
Reinforcement learning (RL) focuses on an essential aspect of intelligent behavior – how an agent ca...
Thesis (Ph.D.), Computer Science, Washington State UniversityTransfer learning is a method in machin...
International audienceWe consider the problem of online reinforcement learning when several state re...
We consider an agent interacting with an environment in a single stream of actions, observations, an...
Different from classic Supervised Learning, Reinforcement Learning (RL), is fundamentally interactiv...
The field of Reinforcement Learning is concerned with teaching agents to take optimal decisions t...
State-of-the-art reinforcement learning (RL) algorithms generally require a large sample of interact...
Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some mea...
In this work, we study model-based reinforcement learning (RL) in unknown stabilizable linear dynami...
Structured reinforcement learning leverages policies with advantageous properties to reach better pe...
Sequentially making-decision abounds in real-world problems ranging from robots needing to interact ...
We contribute Policy Reuse as a technique to improve a re-inforcement learning agent with guidance f...
In this master thesis, we have tried to solve two of most prominent Reinforcement Learning problems:...