Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance, extensibility, and interaction of two such adaptations: vocabulary augmentation and script transliteration. Our evaluations on part-of-speech tagging, universal dependency parsing, and named entity recognition in nine diverse low-resource languages uphold the viability of these approaches while raising new questions around how to optimally adapt multilingual models to low-resource settings.
Comment: Workshop on Multilingual Representation Learning (MRL) 202
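To make the two adaptations concrete, here is a minimal sketch of vocabulary augmentation and script transliteration, assuming the HuggingFace transformers library, mBERT as the base model, and the unidecode package for naive Latin romanization; the added tokens and label count are illustrative placeholders, not the paper's exact setup.

    # Hedged sketch: vocabulary augmentation + script transliteration for a
    # multilingual encoder. Assumes transformers, torch, and unidecode are
    # installed; the tokens and examples below are hypothetical.
    from transformers import AutoTokenizer, AutoModelForTokenClassification
    from unidecode import unidecode

    base = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(base)
    # num_labels=17 assumes the 17 Universal Dependencies UPOS tags.
    model = AutoModelForTokenClassification.from_pretrained(base, num_labels=17)

    # Vocabulary augmentation: add frequent target-language strings that the
    # pretrained vocabulary would otherwise over-segment, then resize the
    # embedding matrix so the new ids get (randomly initialized) vectors.
    tokenizer.add_tokens(["misollar", "qishloq"])  # hypothetical mined tokens
    model.resize_token_embeddings(len(tokenizer))

    # Script transliteration: map target-language text into a script the
    # model saw often during pretraining (here, naive Cyrillic romanization).
    latin = unidecode("например")  # -> "naprimer"
    outputs = model(**tokenizer(latin, return_tensors="pt"))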
NLP technologies are uneven for the world's languages as the state-of-the-art models are only availa...
There are over 7000 languages spoken on earth, but many of these languages suffer from a dearth of n...
Thesis (Ph.D.)--University of Washington, 2022. Modern NLP systems have been highly successful at a wi...
Pretrained multilingual contextual representations have shown great success, but due to the limits o...
Multilingual language models are widely used to extend NLP systems to low-resource languages. Howeve...
Script diversity presents a challenge to Multilingual Language Models (MLLM) by reducing lexical ove...
Supervised deep learning-based approaches have been applied to task-oriented dialog and have proven ...
Transfer learning has led to large gains in performance for nearly all NLP tasks while making downst...
Although multilingual pretrained models (mPLMs) enabled support of various natural language processi...
Some natural languages belong to the same family or share similar syntactic and/or semantic regulari...
Transfer learning based on pretraining language models on a large amount of ra...
Natural language processing (NLP) aims, broadly speaking, to teach computers to under...
In low resource settings, data augmentation strategies are commonly leveraged to improve performance...
This paper investigates very low resource language model pretraining, when less than 100 thousand se...