Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-tuning), which is not efficient, or only tune the last linear layer (linear probing), which suffers a significant accuracy drop compared to full fine-tuning. In this paper, we propose a new parameter-efficient fine-tuning method termed SSF, representing that researchers only need to Scale and Shift the deep Features extracted by a pre-trained model to catch up with the performance of full fine-tuning. In this way, SSF also surprisingly outperforms other parameter-efficient fine-tuning approaches, even with a smaller number of tunable parameters. Furthermore, unlike some existing parameter-efficient fine-tuning methods (e.g., Adapter or ...
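A minimal sketch of the scale-and-shift idea described above, assuming a PyTorch setting (the module and parameter names `ScaleShift`, `gamma`, and `beta` are illustrative assumptions, not the paper's code): the backbone is frozen, and only per-channel scale/shift vectors plus the classification head receive gradients.

```python
import torch
import torch.nn as nn

class ScaleShift(nn.Module):
    """Per-channel scale and shift applied to frozen backbone features.

    Hypothetical sketch of the scale-and-shift idea: gamma (scale) and
    beta (shift) are the only parameters added per feature dimension.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))   # scale, identity init
        self.beta = nn.Parameter(torch.zeros(dim))   # shift, zero init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., dim) features produced by a frozen pre-trained model
        return x * self.gamma + self.beta

# Usage sketch: freeze the backbone, train only scale/shift and the head.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # stand-in for a pre-trained model
for p in backbone.parameters():
    p.requires_grad = False
ssf = ScaleShift(64)
head = nn.Linear(64, 10)
logits = head(ssf(backbone(torch.randn(4, 32))))
```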
The conventional recipe for maximizing model accuracy is to (1) train multiple models with various h...
Large pre-trained language models have recently gained significant traction due to their improved pe...
In this paper, we move towards combining large parametric models with non-parametric prototypical ne...
The current modus operandi in adapting pre-trained models involves updating all the backbone paramet...
Visual Parameter-Efficient Fine-Tuning (PEFT) has become a powerful alternative for full fine-tuning...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
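A hedged PyTorch sketch of the bias-only recipe described above (the helper name `apply_bitfit` and the name-matching heuristic are assumptions for illustration, not the paper's implementation): all weights are frozen and only parameters whose names end in `bias` remain trainable.

```python
import torch.nn as nn

def apply_bitfit(model: nn.Module) -> None:
    """Freeze every parameter except bias terms (bias-only fine-tuning sketch)."""
    for name, param in model.named_parameters():
        # Heuristic: PyTorch bias parameters are conventionally named '...bias'.
        param.requires_grad = name.endswith("bias")

# Usage sketch on a small stand-in model.
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
apply_bitfit(model)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the bias vectors, e.g. ['0.bias', '2.bias']
```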
The impressive performance of deep learning architectures is associated with a massive increase of mode...
Gigantic pre-trained models have become central to natural language processing (NLP), serving as the...
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on ...
Recent advancements have illuminated the efficacy of some tensorization-decomposition Parameter-Effi...
Recent studies have shown that CLIP has achieved remarkable success in performing zero-shot inferenc...
In recent years, convolutional neural networks have achieved state-of-the-art performance in a numbe...
Finetuning can be used to tackle domain-specific tasks by transferring knowledge learned from pre-tr...
As a dominant paradigm, fine-tuning a pre-trained model on the target data is widely used in many de...
Recent advancements in Large Language Models (LLMs) have enabled the development of a single model c...