Despite the widespread use of pre-trained models in NLP, well-performing pre-trained models for low-resource languages are scarce. To address this issue, we propose two novel BERT models for the Luxembourgish language that improve on the state of the art. We also present an empirical study on both the performance and robustness of the investigated BERT models. We compare the models on a set of downstream NLP tasks and evaluate their robustness against different types of data perturbations. Additionally, we provide novel datasets to evaluate the performance of Luxembourgish language models. Our findings reveal that pre-training a pre-loaded model has a positive effect on both the performance and robustness of fine-tuned models ...
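The robustness evaluation mentioned above can be pictured as comparing a fine-tuned model's predictions on clean versus perturbed inputs. The following is a minimal sketch of that idea using one toy perturbation (adjacent-character swaps); the checkpoint name, the perturbation type, and the example sentences are hypothetical placeholders, not the study's actual setup.

import random

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def swap_adjacent_chars(text: str, rate: float = 0.05, seed: int = 0) -> str:
    # One simple perturbation type: randomly swap adjacent characters.
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def predict(model, tokenizer, text: str) -> int:
    # Return the argmax class index for a single input sentence.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())

# Hypothetical checkpoint name; substitute any fine-tuned classifier.
checkpoint = "some-org/luxembourgish-bert-finetuned"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
model.eval()

sentences = ["Example sentence one.", "Example sentence two."]  # placeholder data
flips = sum(
    predict(model, tokenizer, s) != predict(model, tokenizer, swap_adjacent_chars(s))
    for s in sentences
)
print(f"Predictions changed by perturbation: {flips}/{len(sentences)}")

Under this kind of setup, a lower flip count under the same perturbation indicates a more robust fine-tuned model.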
Web site: https://camembert-model.fr. Pretrained language models are now ubiquitous in Natural Language ...
Natural language processing of Low-Resource Languages (LRL) is often challenged by the lack of data....
BERT models used in specialized domains all seem to be the result of a simple ...
Pre-trained Language Models such as BERT have become ubiquitous in NLP where they have ...
Pre-trained language models have been dominating the field of natural language processing in recent ...
Large pretrained masked language models have become state-of-the-art solutions for many NLP problems ...
Large pre-trained masked language models have become state-of-the-art solutions for many NLP problems ...
Multilingual language models such as mBERT have seen impressive cross-lingual transfer to a variety ...
5th Workshop on Clinical Natural Language Processing (ClinicalNLP 2023), held at ACL 2023, 14 July ...
Large pretrained masked language models have become state-of-the-art solutions for many NLP problems...
Currently, the most widespread neural network architecture for training language models is the so-called ...
Recently, the development of pre-trained language models has brought natural language processing (NLP) ...
Deep neural language models such as BERT have enabled substantial recent advances in many natural language ...
Recent advances in NLP have significantly improved the performance of language ...