Fine-tuning large language models for different tasks can be costly and inefficient, and even methods that reduce the number of tuned parameters still require full gradient-based optimization. We propose HyperTuning, a novel approach to model adaptation that uses a hypermodel to generate task-specific parameters for a fixed downstream model. We demonstrate a simple setup for hypertuning with HyperT5, a T5-based hypermodel that produces soft prefixes or LoRA parameters for a frozen T5 model from few-shot examples. We train HyperT5 in two stages: first, hyperpretraining with a modified conditional language modeling objective that trains a hypermodel to generate parameters; second, multi-task fine-tuning (MTF) on a large number of diverse lang...
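To make the idea concrete, here is a minimal sketch of the hypermodel-generates-parameters pattern described above: a small hypernetwork pools an encoding of few-shot examples into a task code and emits LoRA matrices for a frozen downstream layer. This is not the HyperT5 implementation; the module names, dimensions, and mean-pooling encoder are illustrative assumptions only.

```python
# Toy sketch (not HyperT5): a hypermodel maps an encoding of few-shot examples
# to LoRA parameters (A, B) for one frozen linear layer of a downstream model.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer whose LoRA update (B @ A) is supplied externally."""

    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)  # downstream model stays frozen
        self.base.bias.requires_grad_(False)

    def forward(self, x, lora_A, lora_B):
        # lora_A: (rank, d_in), lora_B: (d_out, rank), generated per task
        return self.base(x) + x @ lora_A.t() @ lora_B.t()


class HyperModel(nn.Module):
    """Toy hypermodel: pools few-shot example embeddings and emits LoRA matrices."""

    def __init__(self, d_example: int, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.rank, self.d_in, self.d_out = rank, d_in, d_out
        self.encoder = nn.Sequential(nn.Linear(d_example, 256), nn.ReLU())
        self.to_A = nn.Linear(256, rank * d_in)
        self.to_B = nn.Linear(256, d_out * rank)

    def forward(self, fewshot_embeds):
        # fewshot_embeds: (num_examples, d_example); mean-pool into a task code
        task_code = self.encoder(fewshot_embeds).mean(dim=0)
        lora_A = self.to_A(task_code).view(self.rank, self.d_in)
        lora_B = self.to_B(task_code).view(self.d_out, self.rank)
        return lora_A, lora_B


# Usage: only the hypermodel receives gradients; the downstream layer is frozen.
hyper = HyperModel(d_example=64, d_in=32, d_out=32, rank=4)
layer = LoRALinear(32, 32)
fewshot = torch.randn(8, 64)          # stand-in for encoded few-shot examples
A, B = hyper(fewshot)
out = layer(torch.randn(5, 32), A, B)
loss = out.pow(2).mean()
loss.backward()                        # gradients flow to the hypermodel only
```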
Recent advancements in Large Language Models (LLMs) have enabled the development of a single model c...
Prompt-Tuning is a new paradigm for finetuning pre-trained language models in a parameter-efficient ...
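As a brief illustration of the prompt-tuning recipe referenced above, the sketch below prepends a small matrix of trainable soft-prompt embeddings to the input embeddings of a frozen backbone. The prompt length, model dimension, and the stand-in Transformer layer are assumptions for illustration, not any specific paper's setup.

```python
# Minimal soft-prompt-tuning sketch: trainable prompt embeddings are prepended
# to the input embeddings, and only they are updated; the backbone is frozen.
import torch
import torch.nn as nn


class SoftPrompt(nn.Module):
    def __init__(self, prompt_len: int, d_model: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, d_model)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)


# Usage with a frozen stand-in backbone (one Transformer encoder layer here):
backbone = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
for p in backbone.parameters():
    p.requires_grad_(False)            # only the soft prompt is tuned

soft_prompt = SoftPrompt(prompt_len=20, d_model=64)
embeds = torch.randn(2, 10, 64)        # stand-in for token embeddings
out = backbone(soft_prompt(embeds))    # shape: (2, 30, 64)
```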
In this paper, we move towards combining large parametric models with non-parametric prototypical ne...
Pretrained large language models (LLMs) are strong in-context learners that are able to perform few-...
Large language models (LLMs) and vision language models (VLMs) demonstrate excellent performance on ...
Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a...
We introduce BitFit, a sparse-finetuning method where only the bias-terms of the model (or a subset ...
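The BitFit idea summarized above reduces, in its simplest form, to freezing every parameter except the bias terms. Below is a minimal sketch of that selection rule on a stand-in model; the helper name and the name-based filter are illustrative assumptions, not the paper's released code.

```python
# BitFit-style sparse fine-tuning sketch: freeze every parameter whose name
# does not end in "bias", leaving only bias terms trainable.
import torch.nn as nn


def apply_bitfit(model: nn.Module):
    trainable = []
    for name, param in model.named_parameters():
        if name.endswith("bias"):
            param.requires_grad_(True)
            trainable.append(name)
        else:
            param.requires_grad_(False)
    return trainable


# Usage on a stand-in model: only bias parameters receive gradient updates.
model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 10))
bias_names = apply_bitfit(model)
print(bias_names)  # e.g. ['0.bias', '2.bias']
```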
Large-scale pre-trained language models have achieved impressive results on a wide range of downstre...
Pre-trained language models (PLMs) have demonstrated impressive performance across various downstrea...
Language model fine-tuning is essential for modern natural language processing, but is computational...
There is growing interest in adapting large-scale language models using parameter-efficient fine-t...
The advent of hyper-scale and general-purpose pre-trained models is shifting the paradigm of buildin...
Parameter-efficient fine-tuning (PEFT) has shown its effectiveness in adapting the pre-trained langu...
Prior work shows that it is possible to expand pretrained Masked Language Models (MLMs) to new langu...
A recent family of techniques, dubbed lightweight fine-tuning methods, facilitates parameter-effi...