In this paper, we study the problem of mapping natural language instructions to complex spatial actions in a 3D blocks world. We first introduce a new dataset that pairs complex 3D spatial operations to rich natural language descriptions that require complex spatial and pragmatic interpretations such as “mirroring”, “twisting”, and “balancing”. This dataset, built on the simulation environment of Bisk, Yuret, and Marcu (2016), attains language that is significantly richer and more complex, while also doubling the size of the original dataset in the 2D environment with 100 new world configurations and 250,000 tokens. In addition, we propose a new neural architecture that achieves competitive results while automatically discovering an invento...
Abstract — In this paper, we introduce an abstract rep-resentation for manipulation actions that is ...
Suppose that you are required to describe a route step-bystep to somebody who does not know the envi...
1 reading a book using a laptop watching TV reading a book Figure 1: We predict regions in 3D scenes...
We address the grounding of natural lan-guage to concrete spatial constraints, and inference of impl...
We address the grounding of natural lan-guage to concrete spatial constraints, and inference of impl...
What semantic structures can enable a system to understand and use spatial language in realistic sit...
Spatial understanding is crucial in many real-world problems, yet little progress has been made towa...
To perform tasks specified by natural language instructions, autonomous agents need to extract seman...
We consider mapping unrestricted natural language to formal spatial representations.We describe ongo...
Models that can execute natural language instructions for situated robotic tasks such as assembly an...
We present an interactive text to 3D scene generation system that learns the expected spatial layout...
The interpretation of spatial references is highly contextual, requiring joint inference over both l...
Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration metho...
Many modern machine learning approaches require vast amounts of training data to learn new concepts;...
Spatial understanding is a fundamental problem with wide-reaching real-world applications. The repre...
Abstract — In this paper, we introduce an abstract rep-resentation for manipulation actions that is ...
Suppose that you are required to describe a route step-bystep to somebody who does not know the envi...
1 reading a book using a laptop watching TV reading a book Figure 1: We predict regions in 3D scenes...
We address the grounding of natural lan-guage to concrete spatial constraints, and inference of impl...
We address the grounding of natural lan-guage to concrete spatial constraints, and inference of impl...
What semantic structures can enable a system to understand and use spatial language in realistic sit...
Spatial understanding is crucial in many real-world problems, yet little progress has been made towa...
To perform tasks specified by natural language instructions, autonomous agents need to extract seman...
We consider mapping unrestricted natural language to formal spatial representations.We describe ongo...
Models that can execute natural language instructions for situated robotic tasks such as assembly an...
We present an interactive text to 3D scene generation system that learns the expected spatial layout...
The interpretation of spatial references is highly contextual, requiring joint inference over both l...
Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration metho...
Many modern machine learning approaches require vast amounts of training data to learn new concepts;...
Spatial understanding is a fundamental problem with wide-reaching real-world applications. The repre...
Abstract — In this paper, we introduce an abstract rep-resentation for manipulation actions that is ...
Suppose that you are required to describe a route step-bystep to somebody who does not know the envi...
1 reading a book using a laptop watching TV reading a book Figure 1: We predict regions in 3D scenes...