There is growing interest in adapting large-scale language models using parameter-efficient fine-tuning methods. However, accelerating the model itself and achieving better inference efficiency through model compression has not been thoroughly explored. Model compression could provide the benefits of reducing memory footprints, enabling low-precision computations, and ultimately achieving cost-effective inference. To combine parameter-efficient adaptation and model compression, we propose AlphaTuning, consisting of post-training quantization of the pre-trained language model and fine-tuning only some parts of the quantized parameters for a target task. Specifically, AlphaTuning works by employing binary-coding quantization, which factorizes...
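To make the factorization concrete, below is a minimal PyTorch sketch of greedy binary-coding quantization and of a linear layer whose binary codes are frozen while only the scaling factors (and bias) stay trainable, in the spirit of the abstract above. The function and class names (`bcq_quantize`, `AlphaTunedLinear`) are hypothetical and this is an illustrative sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def bcq_quantize(W, num_bits=3):
    """Greedy binary-coding quantization: approximate W (out x in) as
    sum_k alpha_k * B_k with B_k in {-1, +1} and per-row scales alpha_k."""
    residual = W.clone()
    alphas, binaries = [], []
    for _ in range(num_bits):
        B = torch.sign(residual)
        B[B == 0] = 1.0
        alpha = (residual * B).mean(dim=1, keepdim=True)  # least-squares scale per row
        alphas.append(alpha)
        binaries.append(B)
        residual = residual - alpha * B
    return torch.cat(alphas, dim=1), torch.stack(binaries)  # (out, k), (k, out, in)

class AlphaTunedLinear(nn.Module):
    """Linear layer whose binary codes are frozen buffers; only the scaling
    factors (and bias) remain trainable for the downstream task."""
    def __init__(self, linear, num_bits=3):
        super().__init__()
        alphas, binaries = bcq_quantize(linear.weight.data, num_bits)
        self.register_buffer("binaries", binaries)        # frozen binary codes
        self.alphas = nn.Parameter(alphas)                 # task-specific, fine-tuned
        self.bias = nn.Parameter(linear.bias.data.clone()) if linear.bias is not None else None

    def forward(self, x):
        # Reconstruct W = sum_k alpha_k * B_k (written for clarity, not speed).
        W = torch.einsum("ok,koi->oi", self.alphas, self.binaries)
        return F.linear(x, W, self.bias)

layer = AlphaTunedLinear(nn.Linear(768, 768), num_bits=3)
```

Because the binary codes are registered as buffers rather than parameters, an optimizer built over `layer.parameters()` updates only the scaling factors and the bias.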
Neural data compression has been shown to outperform classical methods in terms of rate-distortion (...
The recent advance of self-supervised learning associated with the Transformer architecture enables ...
Multilingual models are often particularly dependent on scaling to generalize to a growing number of...
The increasing size of generative Pre-trained Language Models (PLMs) has greatly increased the deman...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
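As a concrete illustration of bias-only fine-tuning (a minimal sketch under assumed setup, not the paper's code), the trainable subset can be selected in PyTorch by toggling `requires_grad` on named parameters; the toy model here is a stand-in for a pre-trained network.

```python
import torch.nn as nn

# Toy stand-in for a pre-trained model (assumption for illustration).
model = nn.Sequential(nn.Linear(768, 768), nn.GELU(), nn.Linear(768, 2))

# BitFit-style sparse fine-tuning: train only bias terms, freeze everything else.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

print([n for n, p in model.named_parameters() if p.requires_grad])  # ['0.bias', '2.bias']
```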
Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many pri...
In this paper, we move towards combining large parametric models with non-parametric prototypical ne...
Parameter-shared pre-trained language models (PLMs) have emerged as a successful approach in resourc...
Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks requires updat...
Pre-trained language models (PLMs) have demonstrated impressive performance across various downstrea...
Prior work shows that it is possible to expand pretrained Masked Language Models (MLMs) to new langu...
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on ...
Gigantic pre-trained models have become central to natural language processing (NLP), serving as the...
Large language models (LLMs) exhibit excellent performance across a variety of tasks, but they come w...
Transformer-based pre-trained models with millions of parameters require large storage. Recent appro...