Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT. Despite its growing popularity, little to no attention has been paid to standardizing and analyzing the design of few-shot experiments. In this work, we highlight a fundamental risk posed by this shortcoming, illustrating that model performance is highly sensitive to the selection of few shots. We conduct a large-scale experimental study on 40 sets of sampled few shots for six diverse NLP tasks across up to 40 languages. We provide an analysis of success and failure cases of few-shot transfer, which highlights the role of lexical features. Additionally, we show that a straightforward full model fin...
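To make the abstract's core claim concrete, here is a minimal sketch of the kind of sensitivity analysis it describes: sample many independent few-shot sets, train on each, and measure the spread of test accuracy. This is a toy stand-in, not the paper's mBERT fine-tuning pipeline; the synthetic features, the logistic-regression probe, and the constants K and N_SETS are all illustrative assumptions (the paper samples 40 few-shot sets per task/language).

```python
"""Toy sketch: how sensitive is few-shot performance to which shots are sampled?

A linear probe on synthetic features stands in for fine-tuning a pretrained
multilingual encoder; only the sampling protocol mirrors the study's design.
"""
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for target-language data.
X, y = make_classification(n_samples=2000, n_features=64, n_informative=16,
                           n_classes=2, random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(X, y, test_size=0.5,
                                                  random_state=0)

K = 8        # shots per class (illustrative)
N_SETS = 40  # independently sampled few-shot sets, as in the study

accuracies = []
for _ in range(N_SETS):
    # Sample K shots per class from the labeled pool.
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y_pool == c), size=K, replace=False)
        for c in np.unique(y_pool)
    ])
    clf = LogisticRegression(max_iter=1000).fit(X_pool[idx], y_pool[idx])
    accuracies.append(clf.score(X_test, y_test))

print(f"mean acc = {np.mean(accuracies):.3f}, std = {np.std(accuracies):.3f}, "
      f"min = {np.min(accuracies):.3f}, max = {np.max(accuracies):.3f}")
```

The min-to-max gap across the sampled sets is the quantity of interest: a large spread means any single few-shot experiment can over- or under-state transfer quality, which is the risk the abstract highlights.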
Few-shot learning aims to train models that can be generalized to novel classes with only a few samp...
Recent work on multilingual neural machine translation reported competitive performance with respect...
Zero-shot translation is a transfer learning setup that refers to the ability of neural machine tran...
Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these ...
Recently, there has been an increasing interest in models that generate natural language explanation...
Cross-lingual transfer learning with large multilingual pre-trained models can be an effective appro...
Despite achieving state-of-the-art zero-shot performance, existing vision-language models still fall...
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven ...
While recent work on multilingual language models has demonstrated their capacity for cross-lingual ...
Large pre-trained multilingual models such as mBERT and XLM-R enabled effective cross-lingual zero-s...
For many (minority) languages, the resources needed to train large models are not available. We inve...
Few-shot in-context learning (ICL) enables pre-trained language models to perform a previously-unsee... (a minimal prompt-assembly sketch of ICL follows this list)
It has been shown for English that discrete and soft prompting perform strongly in few-shot learning ...
Transfer learning has led to large gains in performance for nearly all NLP tasks while making downst...
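As a concrete illustration of the in-context learning setup mentioned in the ICL abstract above, here is a minimal sketch of assembling a few-shot prompt. The task, demonstrations, and label names are invented for illustration; no particular model API is assumed.

```python
# Minimal sketch of few-shot in-context learning (ICL): the model is not
# fine-tuned; labeled demonstrations are concatenated into the prompt and a
# frozen language model completes the label for a new input. The sentiment
# task and examples below are invented for illustration.
demonstrations = [
    ("The movie was a delight.", "positive"),
    ("I want my money back.", "negative"),
    ("An instant classic.", "positive"),
]
query = "The plot never goes anywhere."

prompt = "".join(
    f"Review: {text}\nSentiment: {label}\n\n"
    for text, label in demonstrations
) + f"Review: {query}\nSentiment:"

print(prompt)
# The prompt would be sent to a frozen language model; its continuation
# ("positive" / "negative") is taken as the prediction.
```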