Pretrained large language models (LLMs) are strong in-context learners that are able to perform few-shot learning without changing model parameters. However, as we show, fine-tuning an LLM on any specific task generally destroys its in-context ability. We discover an important cause of this loss, format specialization, where the model overfits to the format of the fine-tuned task and is unable to output anything beyond this format. We further show that format specialization happens at the beginning of fine-tuning. To solve this problem, we propose Prompt Tuning with MOdel Tuning (ProMoT), a simple yet effective two-stage fine-tuning framework that preserves in-context abilities of the pretrained model. ProMoT first trains a soft prompt for ...
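To make the two-stage recipe concrete, below is a minimal, self-contained PyTorch sketch. The toy model (`SoftPromptLM`), its sizes, and the training loop are illustrative stand-ins rather than the paper's actual setup, and the second stage shown here, updating the model weights while the learned soft prompt is held fixed, is an assumption about how ProMoT proceeds, since the abstract is truncated at this point.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained LM: embeds token ids, prepends a trainable
# soft prompt, and predicts next-token logits. All names and sizes here are
# illustrative, not taken from the ProMoT paper.
class SoftPromptLM(nn.Module):
    def __init__(self, vocab_size=100, d_model=32, prompt_len=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
        self.backbone = nn.GRU(d_model, d_model, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids):
        tok = self.embed(input_ids)                              # (B, T, D)
        prompt = self.soft_prompt.expand(tok.size(0), -1, -1)    # (B, P, D)
        hidden, _ = self.backbone(torch.cat([prompt, tok], dim=1))
        return self.lm_head(hidden[:, -input_ids.size(1):])      # logits for the real tokens

def train(model, params, batches, lr=1e-3):
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for input_ids, labels in batches:
        logits = model(input_ids)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), labels.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

model = SoftPromptLM()
data = [(torch.randint(0, 100, (4, 16)), torch.randint(0, 100, (4, 16))) for _ in range(10)]

# Stage 1 (prompt tuning): only the soft prompt is trainable, the model is frozen.
for p in model.parameters():
    p.requires_grad_(False)
model.soft_prompt.requires_grad_(True)
train(model, [model.soft_prompt], data)

# Stage 2 (model tuning): assumed here to update the model weights while the
# learned soft prompt stays fixed.
for p in model.parameters():
    p.requires_grad_(True)
model.soft_prompt.requires_grad_(False)
train(model, [p for p in model.parameters() if p.requires_grad], data)
```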
Adopting a two-stage paradigm of pretraining followed by fine-tuning, Pretrained Language Models (PL...
In-context learning (ICL) has become the default method for using large language models (LLMs), maki...
With a handful of demonstration examples, large-scale language models show strong capability to perf...
Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a...
Language model fine-tuning is essential for modern natural language processing, but is computational...
Consistency is a key requirement of high-quality translation. It is especially important to adhere t...
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown e...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
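As a rough illustration of the idea, the following minimal PyTorch sketch freezes everything except parameters whose names mark them as bias terms; the generic transformer encoder is an assumed stand-in, not the models evaluated in the BitFit paper.

```python
import torch.nn as nn

# Minimal sketch of bias-only fine-tuning in the spirit of BitFit: freeze every
# parameter except those whose name identifies a bias term. The encoder below
# is a generic stand-in, not the BERT setup used in the paper.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)

trainable = []
for name, param in model.named_parameters():
    is_bias = "bias" in name
    param.requires_grad_(is_bias)
    if is_bias:
        trainable.append(param)

print(f"training {sum(p.numel() for p in trainable)} of "
      f"{sum(p.numel() for p in model.parameters())} parameters")
# An optimizer would then be built over `trainable` only,
# e.g. torch.optim.AdamW(trainable, lr=1e-4).
```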
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...
Open-sourced large language models (LLMs) have demonstrated remarkable efficacy in various tasks wit...
Large language models (LMs) such as GPT-3 have the surprising ability to do in-context learning, whe...
Through in-context learning (ICL), large-scale language models are effective few-shot learners witho...
Fine-tuning large language models for different tasks can be costly and inefficient, and even method...
Pre-trained language models (PLMs) have demonstrated impressive performance across various downstrea...
Few-shot in-context learning (ICL) enables pre-trained language models to perform a previously-unsee...