We propose Imaginet, a model for learning visually grounded representations of language from coupled textual and visual input. The model consists of two Gated Recurrent Unit networks with shared word embeddings, and uses a multi-task objective: given a textual description of a scene, it concurrently predicts the scene's visual representation and the next word in the sentence. Like humans, it acquires meaning representations for individual words from descriptions of visual scenes. Moreover, it learns to effectively use sequential structure in the semantic interpretation of multi-word phrases.
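The architecture described above can be illustrated with a minimal sketch: two GRU pathways read the same shared word embeddings, one pathway maps the final hidden state to an image-feature vector while the other predicts the next word at every step, and the two losses are combined. This is an assumed PyTorch rendering for illustration only; the layer sizes, module names, and the loss weight `alpha` are invented here and do not reflect the authors' actual implementation.

```python
import torch
import torch.nn as nn

class Imaginet(nn.Module):
    """Sketch of a two-pathway GRU model with shared word embeddings.

    All dimensions and the loss weighting below are illustrative
    assumptions, not the original implementation.
    """

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, image_dim=4096):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)          # shared word embeddings
        self.visual_gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.textual_gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.to_image = nn.Linear(hidden_dim, image_dim)          # predict image features
        self.to_word = nn.Linear(hidden_dim, vocab_size)          # predict next word

    def forward(self, tokens):
        e = self.embed(tokens)                                    # (batch, seq, embed_dim)
        vis_states, _ = self.visual_gru(e)
        txt_states, _ = self.textual_gru(e)
        image_pred = self.to_image(vis_states[:, -1])             # final state -> image vector
        next_word_logits = self.to_word(txt_states)               # each step -> next-word logits
        return image_pred, next_word_logits


def multitask_loss(image_pred, image_target, word_logits, next_tokens, alpha=0.5):
    # Weighted sum of the visual (MSE) and textual (cross-entropy) objectives;
    # alpha is an assumed hyperparameter, not a value from the paper.
    visual = nn.functional.mse_loss(image_pred, image_target)
    textual = nn.functional.cross_entropy(
        word_logits.reshape(-1, word_logits.size(-1)), next_tokens.reshape(-1))
    return alpha * visual + (1 - alpha) * textual
```

Because both pathways share the embedding layer, gradients from the visual objective shape the same word vectors used for next-word prediction, which is how grounded word meanings can emerge from the joint training signal.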