Semantic embeddings have advanced the state of the art for countless natural language processing tasks, and various extensions to multimodal domains, such as visual-semantic embeddings, have been proposed. While the power of visual-semantic embeddings comes from the distillation and enrichment of information through machine learning, their inner workings are poorly understood and there is a shortage of analysis tools. To address this problem, we generalize the notion of probing tasks to the visual-semantic case. To this end, we (i) discuss the formalization of probing tasks for embeddings of image-caption pairs, (ii) define three concrete probing tasks within our general framework, (iii) train classifiers to probe for those properties, and ...
Each time we ask for an object, describe a scene, follow directions or read a document containi...
This paper presents a novel approach for automatically generating image descriptions: visual detecto...
Multimodality in Machine Translation Jindřich Libovický Traditionally, most natural language process...
Semantic embeddings have advanced the state of the art for countless natural language processing tas...
Inspired by recent advances in multimodal learning and machine translation, we introduce an encoder-...
It was a dream to make computers intelligent. Like humans who are capable of understanding informati...
This paper presents a novel approach for automatically generating image descriptions: visual detecto...
In recent years, joint text-image embeddings have significantly improved thanks to the development o...
Comprehending the rich semantics in an image and ordering them in linguistic order are essential to ...
In the recent years, the emergence of deep learning models has greatly advanced computer vision and ...
Learning image representations has been an interesting and challenging problem. When users upload im...
Current approaches to learning semantic representations of sentences often use prior word-level know...
Recently introduced self-supervised methods for image representation learning provide on par or supe...
Language and vision are the two most essential parts of human intelligence for interpreting the real...
Thesis (Ph.D.)--University of Washington, 2017-09A goal of artificial intelligence is to create a sy...
Each time we ask for an object, describe a scene, follow directions or read a document containi...
This paper presents a novel approach for automatically generating image descriptions: visual detecto...
Multimodality in Machine Translation Jindřich Libovický Traditionally, most natural language process...
Semantic embeddings have advanced the state of the art for countless natural language processing tas...
Inspired by recent advances in multimodal learning and machine translation, we introduce an encoder-...
It was a dream to make computers intelligent. Like humans who are capable of understanding informati...
This paper presents a novel approach for automatically generating image descriptions: visual detecto...
In recent years, joint text-image embeddings have significantly improved thanks to the development o...
Comprehending the rich semantics in an image and ordering them in linguistic order are essential to ...
In the recent years, the emergence of deep learning models has greatly advanced computer vision and ...
Learning image representations has been an interesting and challenging problem. When users upload im...
Current approaches to learning semantic representations of sentences often use prior word-level know...
Recently introduced self-supervised methods for image representation learning provide on par or supe...
Language and vision are the two most essential parts of human intelligence for interpreting the real...
Thesis (Ph.D.)--University of Washington, 2017-09A goal of artificial intelligence is to create a sy...
Each time we ask for an object, describe a scene, follow directions or read a document containi...
This paper presents a novel approach for automatically generating image descriptions: visual detecto...
Multimodality in Machine Translation Jindřich Libovický Traditionally, most natural language process...