Spatial understanding is crucial in many real-world problems, yet little progress has been made towards building representations that capture spatial knowledge. Here, we move one step forward in this direction and learn such representations by leveraging a task consisting in predicting continuous 2D spatial arrangements of objects given object-relationship-object instances (e.g., “cat under chair”) and a simple neural network model that learns the task from annotated images. We show that the model succeeds in this task and that it is furthermore capable of predicting correct spatial arrangements for unseen objects if either CNN features or word embeddings of the objects are provided. The differences between visual and linguistic features ar...
Neuroscientists postulate 3D representations in the brain in a variety of different coordinate frame...
<div><p>It has been suggested that the map-like representations that support human spatial memory ar...
This manuscript is about a journey. The journey of computer vision and machine learning research fro...
Spatial understanding is a fundamental problem with wide-reaching real-world applications. The repre...
Explicit representations of images are useful for linguistic applications related to images. We desi...
Spatial relations are a basic part of human cognition. However, they are expressed in natural langua...
This paper shows how a standard convolutional neural network (CNN) without recurrent connections is ...
Over the last two decades we have witnessed strong progress on modeling visual object classes, scene...
What semantic structures can enable a system to understand and use spatial language in realistic sit...
Scene representation is the process of converting sensory observations of an environment into compac...
Understanding the spatial relations between objects in images is a surprisingly challenging task. A ...
<p>We introduce a model for incorporating contextual information (such as geography) in learning vec...
Geospatial analysis lacks methods like the word vector representations and pre-trained networks that...
Many modern machine learning approaches require vast amounts of training data to learn new concepts;...
Visual perception plays an essential role in the human recognition system. We heavily rely on visual...
Neuroscientists postulate 3D representations in the brain in a variety of different coordinate frame...
<div><p>It has been suggested that the map-like representations that support human spatial memory ar...
This manuscript is about a journey. The journey of computer vision and machine learning research fro...
Spatial understanding is a fundamental problem with wide-reaching real-world applications. The repre...
Explicit representations of images are useful for linguistic applications related to images. We desi...
Spatial relations are a basic part of human cognition. However, they are expressed in natural langua...
This paper shows how a standard convolutional neural network (CNN) without recurrent connections is ...
Over the last two decades we have witnessed strong progress on modeling visual object classes, scene...
What semantic structures can enable a system to understand and use spatial language in realistic sit...
Scene representation is the process of converting sensory observations of an environment into compac...
Understanding the spatial relations between objects in images is a surprisingly challenging task. A ...
<p>We introduce a model for incorporating contextual information (such as geography) in learning vec...
Geospatial analysis lacks methods like the word vector representations and pre-trained networks that...
Many modern machine learning approaches require vast amounts of training data to learn new concepts;...
Visual perception plays an essential role in the human recognition system. We heavily rely on visual...
Neuroscientists postulate 3D representations in the brain in a variety of different coordinate frame...
<div><p>It has been suggested that the map-like representations that support human spatial memory ar...
This manuscript is about a journey. The journey of computer vision and machine learning research fro...