Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks requires updating hundreds of millions to billions of parameters and storing a separate copy of the PLM weights for every task, resulting in increased costs for storing, sharing, and serving the models. To address this, parameter-efficient fine-tuning (PEFT) techniques were introduced, in which small trainable components are injected into the PLM and updated during fine-tuning. We propose AdaMix, a general PEFT method that tunes a mixture of adaptation modules (built from the underlying PEFT method of choice) introduced in each Transformer layer while keeping most of the PLM weights frozen. For instance, AdaMix can leverage a mixture of adapters like Houlsby or a mi...
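To make the idea concrete, below is a minimal PyTorch sketch of a mixture of bottleneck adapters injected into one Transformer layer, with the backbone kept frozen. The module names, the random routing during training, and the output averaging at inference are illustrative assumptions, not the reference AdaMix implementation.

```python
import torch
import torch.nn as nn


class AdapterMixture(nn.Module):
    """Sketch of a mixture of bottleneck adapters for one Transformer layer.

    Stochastic routing in training and output averaging at inference are
    illustrative assumptions, not the reference AdaMix implementation.
    """

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 16, num_experts: int = 4):
        super().__init__()
        self.down = nn.ModuleList(
            [nn.Linear(hidden_dim, bottleneck_dim) for _ in range(num_experts)]
        )
        self.up = nn.ModuleList(
            [nn.Linear(bottleneck_dim, hidden_dim) for _ in range(num_experts)]
        )
        self.num_experts = num_experts

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        if self.training:
            # Route the batch through one randomly chosen adaptation module.
            i = torch.randint(self.num_experts, (1,)).item()
            delta = self.up[i](torch.relu(self.down[i](hidden_states)))
        else:
            # Combine all adaptation modules at inference by averaging their outputs.
            delta = torch.stack(
                [up(torch.relu(down(hidden_states))) for down, up in zip(self.down, self.up)]
            ).mean(dim=0)
        return hidden_states + delta  # residual connection around the mixture


def freeze_backbone(model: nn.Module) -> None:
    """Keep most PLM weights frozen; only the injected mixtures receive gradients.

    Assumes the AdapterMixture instances are registered under attribute names
    containing "adapter".
    """
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name
```

With this setup only the small adapter weights are updated per task, so the per-task storage cost is a fraction of the full PLM.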
Prior work shows that it is possible to expand pretrained Masked Language Models (MLMs) to new langu...
Fine-tuning large language models for different tasks can be costly and inefficient, and even method...
Black-Box Tuning (BBT) is a derivative-free approach to optimize continuous prompt tokens prepended ...
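As a rough illustration of the derivative-free setting, the sketch below tunes a continuous prompt prepended to the input using plain random search against a placeholder black-box loss. The `query_loss` function, prompt length, and embedding size are hypothetical stand-ins; BBT itself queries a black-box model API and relies on a stronger gradient-free optimizer than the simple loop shown here.

```python
import numpy as np

# Assumed prompt length and embedding size for the illustration.
PROMPT_LEN, EMBED_DIM = 5, 768


def query_loss(prompt: np.ndarray) -> float:
    """Placeholder for a black-box query returning the task loss for a prompt."""
    rng = np.random.default_rng(0)
    target = rng.standard_normal(prompt.shape)  # fixed pseudo-target so the loop runs
    return float(np.mean((prompt - target) ** 2))


def random_search(num_iters: int = 200, step: float = 0.1) -> np.ndarray:
    rng = np.random.default_rng(42)
    prompt = 0.02 * rng.standard_normal((PROMPT_LEN, EMBED_DIM))
    best = query_loss(prompt)
    for _ in range(num_iters):
        candidate = prompt + step * rng.standard_normal(prompt.shape)
        loss = query_loss(candidate)  # forward queries only, no gradients
        if loss < best:
            prompt, best = candidate, loss
    return prompt


tuned_prompt = random_search()
```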
There is growing interest in adapting large-scale language models using parameter-efficient fine-t...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
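A minimal sketch of this bias-only setup follows, assuming a Hugging Face checkpoint such as `bert-base-uncased` and keeping the randomly initialized task head trainable as well; both choices are assumptions for illustration.

```python
from transformers import AutoModelForSequenceClassification

# Bias-only fine-tuning in the spirit of BitFit. The checkpoint name and the
# decision to keep the new classifier head trainable are assumptions.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

for name, param in model.named_parameters():
    # Train bias terms (and the task head); freeze every other pre-trained weight.
    param.requires_grad = name.endswith(".bias") or name.startswith("classifier")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable:,} of {total:,} parameters ({100 * trainable / total:.2f}%)")
```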
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on ...
Recent advancements in Large Language Models (LLMs) have enabled the development of a single model c...
Fine-tuning pre-trained models has been ubiquitously proven to be effective in a wide range of NLP t...
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their gen...
We present a new method, LiST (short for Lite Prompted Self-Training), for parameter-efficient fine-t...
The scale of large pre-trained models (PTMs) poses significant challenges in adapting to downstream ...
Fine-tuning large pre-trained models on downstream tasks has been adopted in a variety of domains re...
Parameter-efficient fine-tuning methods (PEFTs) offer the promise of adapting large pre-trained mode...
Pre-trained language models (PLMs) have demonstrated impressive performance across various downstrea...
A recent family of techniques, dubbed lightweight fine-tuning methods, facilitates parameter-effi...