Many information retrieval tasks require large labeled datasets for fine-tuning. However, such datasets are often unavailable, and their utility for real-world applications can diminish quickly due to domain shifts. To address this challenge, we develop and motivate a method for using large language models (LLMs) to generate large numbers of synthetic queries cheaply. The method begins by generating a small number of synthetic queries using an expensive LLM. After that, a much less expensive one is used to create large numbers of synthetic queries, which are used to fine-tune a family of reranker models. These rerankers are then distilled into a single efficient retriever for use in the target domain. We show that this technique boosts zero...
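The pipeline described in this abstract can be sketched at a high level. This is a minimal illustrative skeleton, not the paper's implementation: `generate_queries` is a hypothetical stand-in for any LLM call, and the model names and counts are assumptions chosen only to show the two-stage cost structure (a few seed queries from an expensive model, then bulk generation with a cheap one).

```python
# Hypothetical sketch of the two-stage synthetic-query pipeline described
# above. generate_queries() is a stand-in for an actual LLM call; all
# names and counts here are illustrative assumptions.

def generate_queries(llm, passages, n_per_passage):
    """Stand-in for an LLM call that writes n queries per passage."""
    return [(f"{llm}-query-{i}-for-{p}", p)
            for p in passages
            for i in range(n_per_passage)]

def two_stage_synthesis(passages, seed_n=1, bulk_n=5):
    # Stage 1: a small number of high-quality seed queries from an
    # expensive LLM.
    seeds = generate_queries("expensive-llm", passages, seed_n)
    # Stage 2: those seeds would be folded into prompts for a much
    # cheaper LLM, which produces the bulk of the synthetic data.
    bulk = generate_queries("cheap-llm", passages, bulk_n)
    return seeds + bulk

pairs = two_stage_synthesis(["doc-a", "doc-b"])
# The resulting (query, passage) pairs would then fine-tune a family of
# rerankers, which are distilled into one efficient target-domain retriever.
```

The point of the split is purely economic: the expensive model sets the query style and quality bar, while the cheap model scales the dataset to fine-tuning size.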
Recent studies have demonstrated the great potential of Large Language Models (LLMs) serving as zero...
While large language models (LLMs) like GPT-4 have recently demonstrated astonishing zero-shot capab...
Dense retrieval models have predominantly been studied for English, where models have shown great su...
In this work, we propose a simple method that applies a large language model (LLM) to large-scale re...
Recent work has shown that small distilled language models are strong competitors to models that are...
Dense retrieval (DR) converts queries and documents into dense embeddings and measures the similarit...
We propose a simple and effective re-ranking method for improving passage retrieval in open question...
Retrieval with extremely long queries and documents is a well-known and challenging task in informat...
Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive per...
Automatic Speech Recognition (ASR) systems have found their use in numerous industrial applications ...
Recent approaches to Open-domain Question Answering refer to an external knowledge base using a retr...
Although neural information retrieval has witnessed great improvements, recent works showed that the...
Neural ranking methods based on large transformer models have recently gained significant attention ...
Pseudo Relevance Feedback (PRF) is known to improve the effectiveness of bag-of-words retrievers. At...
Query rewriting plays a vital role in enhancing conversational search by transforming context-depend...