Gigantic pre-trained models have become central to natural language processing (NLP), serving as the starting point for fine-tuning towards a range of downstream tasks. However, two pain points persist for this paradigm: (a) as the pre-trained models grow bigger (e.g., 175B parameters for GPT-3), even the fine-tuning process can be time-consuming and computationally expensive; (b) the fine-tuned model has the same size as its starting point by default, which is neither sensible due to its more specialized functionality, nor practical since many fine-tuned models will be deployed in resource-constrained environments. To address these pain points, we propose a framework for resource- and parameter-efficient fine-tuning by leveraging the spars...
Model sparsification in deep learning promotes simpler, more interpretable models with fewer paramet...
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP t...
Sparse Mixture-of-Experts (MoE) is a neural architecture design that can be utilized to add learnabl...
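As a rough illustration of the sparse Mixture-of-Experts idea mentioned in the abstract above, the sketch below shows a minimal MoE layer with top-k routing: a small gating network scores experts per token and only the k highest-scoring experts are evaluated. All sizes, the expert count, and the routing details are placeholder assumptions for illustration, not taken from any specific paper listed here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Minimal sparse Mixture-of-Experts layer with top-k gating (illustrative sketch)."""

    def __init__(self, d_model, d_hidden, num_experts=4, top_k=1):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router producing per-expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, d_model). Each token is routed to its top-k experts only.
        scores = self.gate(x)                               # (batch, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # keep only k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, k] == e                   # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# Example: route a batch of 8 token representations through the layer.
layer = SparseMoELayer(d_model=16, d_hidden=32, num_experts=4, top_k=1)
print(layer(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```

Because only k of the experts run per token, the layer adds learnable capacity while keeping per-token compute roughly constant, which is the property these MoE-based fine-tuning approaches build on.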
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...
With the dramatically increased number of parameters in language models, sparsity methods have recei...
Fine-tuning the entire set of parameters of a large pretrained model has become the mainstream appro...
We consider the problem of accurate sparse fine-tuning of large language models (LLMs), that is, fin...
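To make the notion of sparse fine-tuning above concrete, here is a generic sketch (not the specific method of that paper) of fine-tuning under a fixed sparsity pattern: weights are magnitude-pruned once, the resulting masks are stored, and the masks are re-applied after every optimizer step so the sparsity level is preserved throughout training. The 50% unstructured sparsity, the per-layer magnitude criterion, and the helper names are placeholder assumptions.

```python
import torch
import torch.nn as nn

def sparsify_and_fix_masks(model, sparsity=0.5):
    """Magnitude-prune each Linear weight and remember the resulting binary mask.

    Hypothetical helper for illustration; sparsity level and pruning criterion
    are placeholder choices.
    """
    masks = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            threshold = w.abs().flatten().kthvalue(int(sparsity * w.numel())).values
            mask = (w.abs() > threshold).float()
            module.weight.data *= mask          # zero out pruned weights
            masks[name] = mask
    return masks

def reapply_masks(model, masks):
    """Re-zero pruned positions so the sparsity pattern stays fixed during fine-tuning."""
    for name, module in model.named_modules():
        if name in masks:
            module.weight.data *= masks[name]

# Toy fine-tuning loop on a small MLP standing in for an LLM block.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
masks = sparsify_and_fix_masks(model, sparsity=0.5)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
x, y = torch.randn(16, 32), torch.randint(0, 2, (16,))
for _ in range(3):
    loss = nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    reapply_masks(model, masks)   # keep sparsity after each update
```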
In this paper, we move towards combining large parametric models with non-parametric prototypical ne...
Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many pri...
Deep learning has been empirically successful in recent years thanks to the extremely over-parameter...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
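The core mechanism stated in the BitFit abstract, updating only the bias terms while freezing everything else, can be sketched in a few lines of PyTorch. The optional `classifier`-name check for a task head and the toy model below are assumptions for illustration; with a real pre-trained encoder the same call typically leaves well under 1% of parameters trainable.

```python
import torch.nn as nn

def apply_bitfit(model, train_classifier_head=True):
    """Freeze all parameters except bias terms (BitFit-style sparse fine-tuning).

    The `classifier` substring check is an assumption about how the task head
    is named; adjust it for the actual model being fine-tuned.
    """
    trainable = 0
    for name, param in model.named_parameters():
        is_bias = name.endswith(".bias") or name == "bias"
        is_head = train_classifier_head and "classifier" in name
        param.requires_grad = is_bias or is_head
        trainable += param.numel() if param.requires_grad else 0
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable params: {trainable} / {total} ({100 * trainable / total:.2f}%)")

# Toy example on a small encoder-like stack.
model = nn.Sequential(nn.Linear(128, 128), nn.Tanh(), nn.Linear(128, 2))
apply_bitfit(model, train_classifier_head=False)
```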
Large-scale pre-trained language models have achieved impressive results on a wide range of downstre...
Sparsity has become one of the promising methods to compress and accelerate Deep Neural Networks (DN...
Existing fine-tuning methods either tune all parameters of the pre-trained model (full fine-tuning),...
Large Language Models have become the core architecture upon which most modern natural language proc...