International audienceIn this paper, we study how word-like units are represented and activated in a recurrent neu-ral model of visually grounded speech. The model used in our experiments is trained to project an image and its spoken description in a common representation space. We show that a recurrent model trained on spoken sentences implicitly segments its input into word-like units and reliably maps them to their correct visual referents. We introduce a methodology originating from linguistics to analyse the representation learned by neural networks-the gating paradigm-and show that the correct representation of a word is only activated if the network has access to first phoneme of the target word, suggesting that the network does not ...
The time course of spoken word recognition depends largely on the frequencies of a word and its comp...
All words of the languages we know are stored in the mental lexicon. Psycholinguistic models describ...
Computational models can reflect the complexity of human behaviour by implementing multiple constrai...
International audienceIn this paper, we study how word-like units are represented and activated in a...
We investigated word recognition in a Visually Grounded Speech model. The model has been trained on ...
A Visually Grounded Speech model is a neural model which is trained to embed image caption pairs clo...
Many computational models of speech recognition assume that the set of target words is already given...
In recent years, deep learning methods allowed the creation of neural models that are able to proces...
Visual world studies show that upon hearing a word in a target-absent visual context containing rela...
International audienceThe language acquisition literature shows that children do not build their lex...
Humans learn language by interaction with their environment and listening to other humans. It should...
We study the representation and encoding of phonemes in a recurrent neural network model of grounded...
We present a method for visually-grounded spoken term discovery. After training either a HuBERT or w...
International audienceHow do we map the rapid input of spoken language onto phonological and lexical...
A new psycholinguistically motivated and neural network based model of human word recognition is pre...
The time course of spoken word recognition depends largely on the frequencies of a word and its comp...
All words of the languages we know are stored in the mental lexicon. Psycholinguistic models describ...
Computational models can reflect the complexity of human behaviour by implementing multiple constrai...
International audienceIn this paper, we study how word-like units are represented and activated in a...
We investigated word recognition in a Visually Grounded Speech model. The model has been trained on ...
A Visually Grounded Speech model is a neural model which is trained to embed image caption pairs clo...
Many computational models of speech recognition assume that the set of target words is already given...
In recent years, deep learning methods allowed the creation of neural models that are able to proces...
Visual world studies show that upon hearing a word in a target-absent visual context containing rela...
International audienceThe language acquisition literature shows that children do not build their lex...
Humans learn language by interaction with their environment and listening to other humans. It should...
We study the representation and encoding of phonemes in a recurrent neural network model of grounded...
We present a method for visually-grounded spoken term discovery. After training either a HuBERT or w...
International audienceHow do we map the rapid input of spoken language onto phonological and lexical...
A new psycholinguistically motivated and neural network based model of human word recognition is pre...
The time course of spoken word recognition depends largely on the frequencies of a word and its comp...
All words of the languages we know are stored in the mental lexicon. Psycholinguistic models describ...
Computational models can reflect the complexity of human behaviour by implementing multiple constrai...