Self-supervised pretraining has made few-shot learning possible for many NLP tasks. However, the pretraining objectives are not typically adapted specifically for in-context few-shot learning. In this paper, we propose to use self-supervision in an intermediate training stage between pretraining and downstream few-shot usage, with the goal of teaching the model to perform in-context few-shot learning. We propose and evaluate four self-supervised objectives on two benchmarks. We find that the intermediate self-supervision stage produces models that outperform strong baselines. An ablation study shows that several factors affect the downstream performance, such as the amount of training data and the diversity of the self-supervised objectives. Human-anno...
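The abstract above describes an intermediate, self-supervised training stage that casts raw text into the same demonstrations-plus-query format the model will see at few-shot test time. Below is a minimal sketch of that idea, assuming a simple masked-word objective; the helper names, prompt layout, and choice of objective are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the paper's implementation): build one in-context
# training instance from raw text with a masked-word self-supervised objective.
import random

def mask_random_word(sentence: str, mask_token: str = "<mask>"):
    """Mask one randomly chosen word; return (masked_sentence, target_word)."""
    words = sentence.split()
    idx = random.randrange(len(words))
    target = words[idx]
    masked = " ".join(words[:idx] + [mask_token] + words[idx + 1:])
    return masked, target

def build_in_context_instance(sentences, k_shots: int = 3):
    """Concatenate k self-supervised demonstrations plus one query so the
    instance mirrors the input-output format used during few-shot inference."""
    demos = [mask_random_word(s) for s in sentences[: k_shots + 1]]
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos[:-1]]
    query_input, query_target = demos[-1]
    prompt = "\n\n".join(blocks + [f"Input: {query_input}\nOutput:"])
    # Train with a language-modeling loss on query_target given the prompt.
    return prompt, query_target

if __name__ == "__main__":
    raw = [
        "Self-supervised pretraining has enabled few-shot learning for many NLP tasks.",
        "Intermediate training can adapt a model to a new input-output format.",
        "Each demonstration pairs a corrupted sentence with its missing word.",
        "The final query is left incomplete for the model to predict.",
    ]
    prompt, target = build_in_context_instance(raw, k_shots=3)
    print(prompt)
    print("# target:", target)
```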
Deep neural networks require large training sets but suffer from high computational cost and long tr...
Few-shot learning aims to train a model with a limited number of base class samples to classify the ...
Few-shot classification requires deep neural networks to learn generalized representations only from...
Existing few-shot learning (FSL) methods rely on training with a large labeled dataset, which preven...
The few-shot learning ability of vision transformers (ViTs) is rarely investigated though heavily de...
Cross-domain few-shot learning (CD-FSL) has drawn increasing attention for handling large difference...
A primary trait of humans is the ability to learn rich representations and relationships between ent...
Deep learning has recently driven remarkable progress in several applications, including image class...
Training a model with limited data is an essential task for machine learning and visual recognition....
A two-stage training paradigm consisting of sequential pre-training and meta-training stages has bee...
Recent advances in transfer learning and few-shot learning largely rely on annotated data related to...
Few-shot in-context learning (ICL) enables pre-trained language models to perform a previously-unsee...
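Since this abstract is cut off, the sketch below only illustrates the generic few-shot ICL setup it refers to: a handful of labeled demonstrations and an unlabeled test input are concatenated into a single prompt for a frozen pre-trained language model, with no parameter updates. The sentiment task, template, and label words are assumptions chosen for illustration.

```python
# Minimal sketch of few-shot in-context learning prompt assembly; the template
# and sentiment task are illustrative assumptions, not any specific paper's setup.
def build_icl_prompt(demonstrations, test_input,
                     template="Review: {x}\nSentiment: {y}"):
    """Concatenate labeled demonstrations, then append the unlabeled test input."""
    blocks = [template.format(x=x, y=y) for x, y in demonstrations]
    blocks.append(template.format(x=test_input, y="").rstrip())
    return "\n\n".join(blocks)

demos = [
    ("A moving, beautifully shot film.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
print(build_icl_prompt(demos, "The plot drags but the acting is superb."))
# The prompt is fed to a frozen pre-trained LM, and its next-token prediction
# is read off as the label; no gradient steps are taken.
```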
We present a new method, LiST (short for Lite Prompted Self-Training), for parameter-efficient fine-t...
Prompt-based fine-tuning has boosted the performance of Pre-trained Language Models (PLMs) on few-sh...
User-defined keyword spotting is a task to detect new spoken terms defined by users. This can be vie...