Continual learning requires models to continually adapt to newly emerging tasks while minimizing catastrophic forgetting of old ones. This is extremely challenging for large language models (LLMs) under vanilla full-parameter tuning due to high computation costs, memory consumption, and the forgetting issue. Inspired by the success of parameter-efficient tuning (PET), we propose Continual Parameter-Efficient Tuning (ConPET), a generalizable paradigm for continual task adaptation of LLMs with task-number-independent training complexity. ConPET includes two versions for different application scenarios. First, Static ConPET can adapt former continual learning methods, originally designed for relatively smaller models, to LLMs through ...
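To make the setting concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of parameter-efficient continual tuning: a frozen LLM layer is augmented with one small LoRA-style module per task, so each new task trains only a tiny set of parameters and the per-task training cost does not grow with the number of previously learned tasks. The class names, the LoRA rank, and the per-task module design are assumptions for illustration; ConPET's actual Static and Dynamic designs may differ.

```python
# Illustrative sketch only -- not ConPET's released code. It shows the general
# idea of parameter-efficient continual tuning: a frozen backbone layer plus one
# small trainable LoRA-style module per task. Names and hyperparameters are
# hypothetical.
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # backbone weights stay frozen
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)     # new module starts as a no-op

    def forward(self, x):
        return self.base(x) + self.lora_b(self.lora_a(x))


class PerTaskPET(nn.Module):
    """Holds one PET module per task over a shared frozen layer, so training a
    new task touches only that task's small module (task-number-independent
    per-task training cost)."""

    def __init__(self, shared_layer: nn.Linear):
        super().__init__()
        self.shared_layer = shared_layer
        self.task_modules = nn.ModuleDict()

    def add_task(self, task_id: str, rank: int = 8):
        self.task_modules[task_id] = LoRALinear(self.shared_layer, rank)

    def forward(self, x, task_id: str):
        return self.task_modules[task_id](x)


# Usage: register a new task and collect only its trainable parameters.
layer = nn.Linear(1024, 1024)
model = PerTaskPET(layer)
model.add_task("task_3")
trainable = [p for p in model.task_modules["task_3"].parameters() if p.requires_grad]
```

This sketch only illustrates why tuning costs stay bounded; replay strategies and module selection, which the abstract attributes to the Static and Dynamic variants respectively, are omitted.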
Fine-tuning the entire set of parameters of a large pretrained model has become the mainstream appro...
Why can pre-trained language models (PLMs) learn universal representations and effectively adapt to ...
Pretrained large language models (LLMs) are strong in-context learners that are able to perform few-...
Parameter-efficient tuning (PET) has been widely explored in recent years because it tunes much fewe...
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on ...
Recent work on large language models relies on the intuition that most natural language processing t...
Fine-tuning large language models for different tasks can be costly and inefficient, and even method...
Large language models (LLMs) are known to encode world knowledge in their parameters as they pretrain...
Large-scale pre-trained language models have achieved impressive results on a wide range of downstre...
This paper considers continual learning of large-scale pretrained neural machine translation models w...
Pre-trained language models (PLMs) have demonstrated impressive performance across various downstrea...
Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many pri...
Deep learning has enjoyed tremendous success over the last decade, but the training of practically u...
Recent advancements in Large Language Models (LLMs) have enabled the development of a single model c...
Large pre-trained, zero-shot capable models have shown considerable success both for standard transf...