Current language models have been criticised for learning language from text alone without connection between words and their meaning. Consequently, multimodal training has been proposed as a way for creating models with better language understanding by providing the lacking connection. We focus on pre-trained multimodal vision-and-language (VL) models for which there already are some results on their language understanding capabilities. An unresolved issue with evaluating the linguistic skills of these models, however, is that there is no established method for adapting them to text-only input without out-of-distribution uncertainty. To find the best approach, we investigate and compare seven possible methods for adapting three different p...
Pretrained models have produced great success in both Computer Vision (CV) and Natural Language Proc...
As transformer evolves, pre-trained models have advanced at a breakneck pace in recent years. They h...
Large language models are known to suffer from the hallucination problem in that they are prone to o...
Current language models have been criticised for learning language from text alone without connectio...
In recent years, joint text-image embeddings have significantly improved thanks to the development o...
International audienceIn recent years, joint text-image embeddings have significantly improved thank...
Vision and Language (VL) models have demonstrated remarkable zero-shot performance in a variety of t...
Language acquisition by children and machines is remarkable. Yet while children learn from hearing a...
Vision-Language Pretraining (VLP) models have recently successfully facilitated many cross-modal dow...
With the burgeoning amount of data of image-text pairs and diversity of Vision-and-Language (V&L) ta...
There are limitations in learning language from text alone. Therefore, recent focus has been on deve...
International audienceVision models trained on multimodal datasets can benefit from the wide availab...
Paper accepted for presentation at the ViGIL 2021 workshop @NAACL. This version: added models to the...
Large language models are known to suffer from the hallucination problem in that they are prone to o...
The problem of learning language models from large text corpora has been widely stud-ied within the ...
Pretrained models have produced great success in both Computer Vision (CV) and Natural Language Proc...
As transformer evolves, pre-trained models have advanced at a breakneck pace in recent years. They h...
Large language models are known to suffer from the hallucination problem in that they are prone to o...
Current language models have been criticised for learning language from text alone without connectio...
In recent years, joint text-image embeddings have significantly improved thanks to the development o...
International audienceIn recent years, joint text-image embeddings have significantly improved thank...
Vision and Language (VL) models have demonstrated remarkable zero-shot performance in a variety of t...
Language acquisition by children and machines is remarkable. Yet while children learn from hearing a...
Vision-Language Pretraining (VLP) models have recently successfully facilitated many cross-modal dow...
With the burgeoning amount of data of image-text pairs and diversity of Vision-and-Language (V&L) ta...
There are limitations in learning language from text alone. Therefore, recent focus has been on deve...
International audienceVision models trained on multimodal datasets can benefit from the wide availab...
Paper accepted for presentation at the ViGIL 2021 workshop @NAACL. This version: added models to the...
Large language models are known to suffer from the hallucination problem in that they are prone to o...
The problem of learning language models from large text corpora has been widely stud-ied within the ...
Pretrained models have produced great success in both Computer Vision (CV) and Natural Language Proc...
As transformer evolves, pre-trained models have advanced at a breakneck pace in recent years. They h...
Large language models are known to suffer from the hallucination problem in that they are prone to o...