Large-scale pre-trained language models have recently achieved impressive results on a wide range of downstream tasks. However, fine-tuning an extremely large-scale pre-trained language model on limited target datasets is often plagued by overfitting and representation degradation. In this paper, we propose a Dynamic Parameter Selection (DPS) algorithm for fine-tuning large-scale pre-trained models, which adaptively selects a more promising subnetwork to perform staging updates based on gradients from back-propagation. Experiments on the GLUE benchmark show that DPS outperforms previous fine-tuning methods in terms of overall performance and stability, and consistently achieves better results with variable pre-trained language models...
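As a rough illustration of the selection step described in this abstract, the sketch below scores each parameter tensor by accumulated squared gradients and restricts optimizer updates to the top-scoring entries. It is a minimal sketch of gradient-based subnetwork selection for a generic PyTorch model, not the DPS implementation itself; the names `score_parameters`, `build_masks`, `masked_update`, and the hyperparameter `keep_ratio` are assumptions introduced here for illustration.

```python
# Minimal sketch (not the authors' released code) of gradient-based subnetwork
# selection during fine-tuning. Assumes a standard PyTorch model whose forward
# pass takes a single input tensor; `keep_ratio` is a hypothetical hyperparameter
# giving the fraction of entries kept trainable in each parameter tensor.
import torch


def score_parameters(model, dataloader, loss_fn):
    """Accumulate squared gradient magnitudes (a Fisher-style proxy) over one
    pass of the data as per-parameter importance scores."""
    scores = {name: torch.zeros_like(p)
              for name, p in model.named_parameters() if p.requires_grad}
    for inputs, labels in dataloader:
        model.zero_grad()
        loss = loss_fn(model(inputs), labels)
        loss.backward()
        for name, p in model.named_parameters():
            if p.grad is not None:
                scores[name] += p.grad.detach() ** 2
    return scores


def build_masks(scores, keep_ratio=0.3):
    """Keep only the top `keep_ratio` fraction of entries in each tensor."""
    masks = {}
    for name, s in scores.items():
        k = max(1, int(keep_ratio * s.numel()))
        threshold = torch.topk(s.flatten(), k).values.min()
        masks[name] = (s >= threshold).to(s.dtype)
    return masks


def masked_update(model, masks, optimizer):
    """Zero out gradients outside the selected subnetwork before the optimizer
    step, so only the chosen subnetwork is actually updated."""
    for name, p in model.named_parameters():
        if p.grad is not None and name in masks:
            p.grad.mul_(masks[name])
    optimizer.step()
```

In this sketch the masks could be recomputed periodically during training to make the selection adaptive rather than fixed once before fine-tuning.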
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP t...
There is growing interest in adapting large-scale language models using parameter-efficient fine-t...
Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-tuning),...
Gigantic pre-trained models have become central to natural language processing (NLP), serving as the...
In this paper, we move towards combining large parametric models with non-parametric prototypical ne...
Language model fine-tuning is essential for modern natural language processing, but is computational...
Fine-tuning large language models for different tasks can be costly and inefficient, and even method...
Pretrained large language models (LLMs) are strong in-context learners that are able to perform few-...
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on ...
Recent advances in NLP have been driven by a range of large-scale pretrained language models (PLMs). Thes...
Adopting a two-stage paradigm of pretraining followed by fine-tuning, Pretrained Language Models (PL...
Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a...
Differentially Private (DP) learning has seen limited success for building large deep learning model...
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...