Recent work has focused on compressing pre-trained language models (PLMs) such as BERT, with the major emphasis on improving in-distribution performance on downstream tasks. However, very few of these studies have analyzed the impact of compression on the generalizability and robustness of compressed models on out-of-distribution (OOD) data. Towards this end, we study two popular model compression techniques, knowledge distillation and pruning, and show that although the compressed models obtain performance similar to their PLM counterparts on a task's in-distribution development sets, they are significantly less robust on OOD test sets. Further analysis indicates that the compressed models overfit to shortcut samples ...
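For readers unfamiliar with the first technique mentioned above, the following is a minimal sketch of a standard knowledge-distillation objective (temperature-scaled KL divergence combined with hard-label cross-entropy). It illustrates the generic technique only, not the specific training setup of the study in this abstract; the temperature and mixing weight are placeholder assumptions.

```python
# Generic knowledge-distillation loss sketch (not the paper's exact setup).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL loss with hard-label cross-entropy.

    temperature and alpha are illustrative defaults, not values from the study.
    """
    # Soft targets: match the teacher's temperature-softened distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary cross-entropy against the gold labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```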
Factorizing a large matrix into small matrices is a popular strategy for model compression. Singular...
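As a companion to the factorization idea above, here is a minimal sketch of low-rank weight compression via truncated SVD: a large weight matrix is approximated by the product of two thin factors. The rank and matrix size are illustrative assumptions, not values taken from this abstract.

```python
# Generic truncated-SVD factorization sketch for compressing a weight matrix.
import torch

def factorize_linear_weight(W: torch.Tensor, rank: int):
    """Approximate W (out_dim x in_dim) as A @ B, with A: out_dim x rank and B: rank x in_dim."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # absorb singular values into the left factor
    B = Vh[:rank, :]
    return A, B

# Usage: a d x d matrix (d*d params) becomes two factors with 2*d*rank params.
W = torch.randn(768, 768)
A, B = factorize_linear_weight(W, rank=64)
relative_error = torch.linalg.norm(W - A @ B) / torch.linalg.norm(W)
```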
Despite achieving state-of-the-art performance on many NLP tasks, the high energy cost and long infe...
Fine-tuning pre-trained models has achieved impressive performance on standard natural language pro...
Multilingual models are often particularly dependent on scaling to generalize to a growing number of...
Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processin...
Leveraging shared learning through Massively Multilingual Models, state-of-the-art machine translati...
Large language models (LLMs), while transformative for NLP, come with significant computational dema...
As language models increase in size by the day, methods for efficient inference are critical to leve...
Model compression by way of parameter pruning, quantization, or distillation has recently gained pop...
Transformer-based language models have become a key building block for natural language processing. ...
When considering a model architecture, there are several ways to reduce its memory footprint. Histor...
The increasing size of generative Pre-trained Language Models (PLMs) has greatly increased the deman...
Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many pri...
The growing size of neural language models has led to increased attention in model compression. The ...
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their gen...