In complex tasks, such as those with large combinatorial action spaces, random exploration may be too inefficient to achieve meaningful learning progress. In this work, we use a curriculum of progressively growing action spaces to accelerate learning. We assume the environment is out of our control, but that the agent may set an internal curriculum by initially restricting its action space. Our approach uses off-policy reinforcement learning to estimate optimal value functions for multiple action spaces simultaneously and efficiently transfers data, value estimates, and state representations from restricted action spaces to the full task. We show the efficacy of our approach in proof-of-concept control tasks and on challenging large-scale S...
A fundamental challenge for reinforcement learning (RL) is how to achieve efficient exploration in in...
Reinforcement learning is inherently unsafe for use in physical systems, as learning by trial-and-er...
Graduation date: 2010. Reinforcement learning in real-world domains suffers from three curses of dimen...
The reinforcement learning (RL) framework formalizes the notion of learning with interactions. Many ...
Deep Reinforcement Learning (DRL) is becoming a popular and mature framework for learning to solve ...
The design of reinforcement learning solutions to many problems artificially constrains the action se...
Summarization: The majority of learning algorithms available today focus on approximating the state ...
This paper presents a new method for the autonomous construction of hierarchical action and state re...
Typical reinforcement learning (RL) agents learn to complete tasks specified by reward functions tai...
StarCraft II poses a grand challenge for reinforcement learning. The main difficulties include huge ...
Applications of reinforcement learning to continuous control tasks often rely on a steady, informati...
Reinforcement learning has long been advertised as the one with the capability to intelligently mimi...
Large state and action spaces are very challenging for reinforcement learning. However, in many domai...
When applying reinforcement learning in domains with very large or continuous state spaces, the expe...
While exploring to find better solutions, an agent performing on-line reinforcement learning (RL) ca...