In this paper, we move towards combining large parametric models with non-parametric prototypical networks. We propose prototypical fine-tuning, a novel prototypical framework for fine-tuning pretrained language models (LMs), which automatically learns a bias to improve predictive performance for varying data sizes, especially low-resource settings. Our prototypical fine-tuning approach can automatically adjust the model capacity according to the number of data points and the model's inherent attributes. Moreover, we propose four principles for effective prototypical fine-tuning towards the optimal solution. Experimental results across various datasets show that our work achieves significant performance improvements under various low-resource s...
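To make the core idea concrete, the sketch below shows a non-parametric prototypical head on top of a pretrained encoder: each class is represented by the mean embedding of its labelled examples, and inputs are scored by their distance to these prototypes. This is a minimal sketch only; the backbone name, the helper functions (encode, build_prototypes, predict), and the toy data are illustrative assumptions, not the paper's actual formulation or code.

```python
# Minimal sketch of a prototypical classification head over a pretrained
# encoder (illustrative only; not the paper's exact method).
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumed backbone for illustration
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def encode(texts):
    """Return [CLS] embeddings for a list of texts."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]  # (batch, hidden)

def build_prototypes(texts_per_class):
    """One prototype per class: the mean embedding of its labelled examples."""
    return torch.stack([encode(texts).mean(dim=0) for texts in texts_per_class])

def predict(texts, prototypes):
    """Score each text by (negative) distance to every class prototype."""
    emb = encode(texts)                   # (n, hidden)
    dists = torch.cdist(emb, prototypes)  # (n, num_classes)
    return (-dists).softmax(dim=-1)

# Toy usage with two made-up classes; in prototypical fine-tuning the encoder
# and the prototypes would be updated jointly on the downstream data.
protos = build_prototypes([["great movie", "loved it"],
                           ["terrible", "boring plot"]])
probs = predict(["what a fantastic film"], protos)
```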
Adopting a two-stage paradigm of pretraining followed by fine-tuning, Pretrained Language Models (PL...
We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstre...
A recent family of techniques, dubbed as lightweight fine-tuning methods, facilitates parameter-effi...
In this paper, we move towards combining large parametric models with non-parametric prototypical ne...
Large-scale pre-trained language models have achieved impressive results on a wide range of downstre...
Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a...
Pretrained large language models (LLMs) are strong in-context learners that are able to perform few-...
Gigantic pre-trained models have become central to natural language processing (NLP), serving as the...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
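As a rough illustration of the bias-only idea described above, the sketch below freezes every parameter of a pretrained model except those whose name contains "bias". The backbone choice, optimizer settings, and handling of the classification head are assumptions for illustration, not BitFit's reference implementation.

```python
# Rough sketch of bias-only (BitFit-style) fine-tuning: freeze everything
# except bias terms. Backbone and hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

trainable = []
for name, param in model.named_parameters():
    param.requires_grad = "bias" in name  # only bias terms stay trainable
    if param.requires_grad:
        trainable.append(param)

optimizer = torch.optim.AdamW(trainable, lr=1e-3)
# Note: the task head here is newly initialised; in practice one would usually
# keep it trainable as well, or train only a chosen subset of the bias terms.
```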
State-of-the-art pre-trained language models have been shown to memorise facts and perform well wi...
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP t...
This paper investigates very low resource language model pretraining, when less than 100 thousand se...
Finetuning can be used to tackle domain specific tasks by transferring knowledge learned from pre-tr...
Pre-training language models (LMs) on large-scale unlabeled text data makes the model much easier to...
The conventional recipe for maximizing model accuracy is to (1) train multiple models with various h...