Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks requires updating hundreds of millions to billions of parameters and storing a separate copy of the PLM weights for every task, resulting in increased costs for storing, sharing, and serving the models. To address this, parameter-efficient fine-tuning (PEFT) techniques were introduced, in which small trainable components are injected into the PLM and updated during fine-tuning. We propose AdaMix, a general PEFT method that tunes a mixture of adaptation modules (built from the underlying PEFT method of choice) introduced in each Transformer layer while keeping most of the PLM weights frozen. For instance, AdaMix can leverage a mixture of adapters like Houlsby or a mi...
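To make the idea concrete, below is a minimal PyTorch sketch of a mixture of bottleneck adapters injected into one Transformer layer, with the backbone kept frozen. The module names, the random routing during training, and the output averaging at inference are illustrative assumptions, not the reference AdaMix implementation.

```python
import torch
import torch.nn as nn


class AdapterMixture(nn.Module):
    """Sketch of a mixture of bottleneck adapters for one Transformer layer.

    Stochastic routing in training and output averaging at inference are
    illustrative assumptions, not the reference AdaMix implementation.
    """

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 16, num_experts: int = 4):
        super().__init__()
        self.down = nn.ModuleList(
            [nn.Linear(hidden_dim, bottleneck_dim) for _ in range(num_experts)]
        )
        self.up = nn.ModuleList(
            [nn.Linear(bottleneck_dim, hidden_dim) for _ in range(num_experts)]
        )
        self.num_experts = num_experts

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Route the batch through one randomly chosen adaptation module.
            i = torch.randint(self.num_experts, (1,)).item()
            delta = self.up[i](torch.relu(self.down[i](hidden_states)))
        else:
            # Combine all adaptation modules at inference by averaging their outputs.
            delta = torch.stack(
                [up(torch.relu(down(hidden_states))) for down, up in zip(self.down, self.up)]
            ).mean(dim=0)
        return hidden_states + delta  # residual connection around the mixture


def freeze_backbone(model: nn.Module) -> None:
    """Keep most PLM weights frozen; only the injected mixtures receive gradients.

    Assumes the AdapterMixture instances are registered under attribute names
    containing "adapter".
    """
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name
```

With this setup only the small adapter weights are updated per task, so the per-task storage cost is a fraction of the full PLM.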
Prior work shows that it is possible to expand pretrained Masked Language Models (MLMs) to new langu...
Fine-tuning large language models for different tasks can be costly and inefficient, and even method...
Black-Box Tuning (BBT) is a derivative-free approach to optimize continuous prompt tokens prepended ...
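As a rough illustration of the derivative-free setting, the sketch below tunes a continuous prompt prepended to the input using plain random search against a placeholder black-box loss. The `query_loss` function, prompt length, and embedding size are hypothetical stand-ins; BBT itself queries a black-box model API and relies on a stronger gradient-free optimizer than the simple loop shown here.

```python
import numpy as np

# Assumed prompt length and embedding size for the illustration.
PROMPT_LEN, EMBED_DIM = 5, 768


def query_loss(prompt: np.ndarray) -> float:
    """Placeholder for a black-box query returning the task loss for a prompt."""
    rng = np.random.default_rng(0)
    target = rng.standard_normal(prompt.shape)  # fixed pseudo-target so the loop runs
    return float(np.mean((prompt - target) ** 2))


def random_search(num_iters: int = 200, step: float = 0.1) -> np.ndarray:
    rng = np.random.default_rng(42)
    prompt = 0.02 * rng.standard_normal((PROMPT_LEN, EMBED_DIM))
    best = query_loss(prompt)
    for _ in range(num_iters):
        candidate = prompt + step * rng.standard_normal(prompt.shape)
        loss = query_loss(candidate)  # forward queries only, no gradients
        if loss < best:
            prompt, best = candidate, loss
    return prompt


tuned_prompt = random_search()
```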
There is growing interest in adapting large-scale language models using parameter-efficient fine-t...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
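A minimal sketch of this bias-only setup follows, assuming a Hugging Face checkpoint such as `bert-base-uncased` and keeping the randomly initialized task head trainable as well; both choices are assumptions for illustration.

```python
from transformers import AutoModelForSequenceClassification

# Bias-only fine-tuning in the spirit of BitFit. The checkpoint name and the
# decision to keep the new classifier head trainable are assumptions.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

for name, param in model.named_parameters():
    # Train bias terms (and the task head); freeze every other pre-trained weight.
    param.requires_grad = name.endswith(".bias") or name.startswith("classifier")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters ({100 * trainable / total:.2f}%)")
```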
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on ...
Recent advancements in Large Language Models (LLMs) have enabled the development of a single model c...
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP t...
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their gen...
We present a new method, LiST (short for Lite Prompted Self-Training), for parameter-efficient fine-t...
The scale of large pre-trained models (PTMs) poses significant challenges in adapting to downstream ...
Fine-tuning large pre-trained models on downstream tasks has been adopted in a variety of domains re...
Parameter-efficient fine-tuning methods (PEFTs) offer the promise of adapting large pre-trained mode...
Pre-trained language models (PLMs) have demonstrated impressive performance across various downstrea...
A recent family of techniques, dubbed lightweight fine-tuning methods, facilitates parameter-effi...