Leveraging shared learning through Massively Multilingual Models, state-of-the-art machine translation models are often able to adapt to the paucity of data for low-resource languages. However, this performance comes at the cost of significantly bloated models which are not practically deployable. Knowledge Distillation is one popular technique for developing competitive, lightweight models. In this work, we first evaluate its use to compress MT models, focusing on languages with extremely limited training data. Through our analysis across 8 languages, we find that the variance in the performance of the distilled models, due to their dependence on priors including the amount of synthetic data used for distillation, the student architecture, train...
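For context on the compression technique being evaluated, the sketch below shows one common formulation of knowledge distillation for an NMT student: the student is trained on a blend of the hard cross-entropy loss and a soft loss that matches the teacher's output distribution. The "synthetic data" mentioned above corresponds to the sequence-level variant, where the target tokens would be teacher-generated translations rather than gold references. This is a minimal sketch under assumed settings; the function name, `alpha`, `temperature`, and tensor shapes are illustrative and not the paper's exact configuration.

```python
# Minimal sketch of a token-level knowledge-distillation objective for an NMT
# student (PyTorch). `alpha` and `temperature` are illustrative assumptions,
# not the setup studied in the paper.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      target_ids: torch.Tensor,
                      pad_id: int,
                      alpha: float = 0.5,
                      temperature: float = 2.0) -> torch.Tensor:
    """Blend hard-label cross-entropy with a soft-label term from the teacher.

    student_logits, teacher_logits: (batch, tgt_len, vocab)
    target_ids: (batch, tgt_len) reference (or teacher-generated) target tokens
    """
    vocab = student_logits.size(-1)

    # Hard-label term: cross-entropy against the target tokens, ignoring padding.
    ce = F.cross_entropy(student_logits.view(-1, vocab),
                         target_ids.view(-1),
                         ignore_index=pad_id)

    # Soft-label term: KL divergence between the student's and the teacher's
    # temperature-smoothed token distributions (padding not masked here, for brevity).
    kd = F.kl_div(F.log_softmax(student_logits / temperature, dim=-1),
                  F.softmax(teacher_logits / temperature, dim=-1),
                  reduction="batchmean") * (temperature ** 2)

    return alpha * ce + (1.0 - alpha) * kd
```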