Although pre-trained language models such as BERT have achieved appealing performance on a wide range of Natural Language Processing (NLP) tasks, they are computationally expensive to deploy in real-time applications. A typical remedy is knowledge distillation, which compresses these large pre-trained models (teacher models) into small student models. However, for a target domain with scarce training data, the teacher can hardly pass useful knowledge to the student, which degrades the student models' performance. To tackle this problem, we propose a method to learn to augment data for BERT Knowledge Distillation in target domains with scarce labeled data, by learning a cross-domain manipulation scheme that automatically au...
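The compression step referenced above is the standard teacher-student setup: the small model is trained to match the large model's softened output distribution in addition to the gold labels. The sketch below is a generic distillation objective in PyTorch, not the paper's cross-domain augmentation scheme; the function name distillation_loss and the temperature / alpha hyperparameters are illustrative assumptions rather than values from any of the cited works.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft-target term: KL divergence between the temperature-scaled
    # teacher distribution and the student's log-probabilities.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd_term = F.kl_div(soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2
    # Hard-target term: ordinary cross-entropy on the gold labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

# Minimal usage example with random logits for a 3-class task.
if __name__ == "__main__":
    student_logits = torch.randn(8, 3)
    teacher_logits = torch.randn(8, 3)
    labels = torch.randint(0, 3, (8,))
    print(distillation_loss(student_logits, teacher_logits, labels))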
Neural machine translation (NMT) systems have greatly improved the quality available from machine tr...
The use of superior algorithms and complex architectures in language models has successfully impart...
Deep and large pre-trained language models (e.g., BERT, GPT-3) are state-of-the-art for various natu...
Recent advances with large-scale pre-trained language models (e.g., BERT) have brought significant p...
In this paper, we propose Stochastic Knowledge Distillation (SKD) to obtain compact BERT-style langu...
Large-scale pretrained language models have led to significant improvements in Natural Language Proc...
In the natural language processing (NLP) literature, neural networks are becoming increasingly deepe...
Knowledge distillation is typically conducted by training a small model (the student) to mimic a lar...
In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage re...
Pre-trained models such as BERT have achieved great results in various natural language process...
Building a Neural Language Model from scratch involves a large number of different design decisions. Y...
Pre-trained language representation models, such as BERT, capture a general language representation ...