Pretraining multilingual language models from scratch requires considerable computational resources and substantial training data. A more efficient alternative is therefore to adapt existing pretrained language models (PLMs) to new languages via vocabulary extension and continued pretraining. However, this approach usually initializes the embeddings of new subwords randomly and adds substantially more embedding parameters to the model, which undermines its efficiency. To address these issues, we propose a novel framework, One For All (OFA), which wisely initializes the embeddings of unseen subwords from target languages and can thus adapt a PLM to multiple languages efficiently and effectively...
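The core idea, independent of any particular framework, is to build the embedding of an unseen target-language subword from the embeddings of related source subwords instead of drawing it at random. The sketch below shows one common instantiation, a similarity-weighted average over source subwords; the neighbour lookup `similar_source_subwords` is a hypothetical stand-in for whatever external signal (e.g. multilingual word vectors) supplies those neighbours, and the code is not the exact OFA algorithm.

```python
# Minimal sketch of similarity-weighted initialization for new subword
# embeddings. `similar_source_subwords` is a hypothetical callable that
# returns (source_subword, similarity) pairs for a new subword; it stands
# in for an external cross-lingual signal and is NOT the OFA method itself.
import numpy as np

def init_new_embeddings(source_emb: np.ndarray,
                        source_vocab: dict[str, int],
                        new_subwords: list[str],
                        similar_source_subwords) -> np.ndarray:
    """Return one embedding row per new subword.

    source_emb:  (|V_src|, d) embedding matrix of the pretrained model.
    source_vocab: subword -> row index in source_emb.
    similar_source_subwords(sw) -> list of (source_subword, similarity).
    """
    d = source_emb.shape[1]
    rng = np.random.default_rng(0)
    rows = []
    for sw in new_subwords:
        neighbours = [(source_vocab[s], sim)
                      for s, sim in similar_source_subwords(sw)
                      if s in source_vocab and sim > 0]
        if neighbours:
            idx = np.array([i for i, _ in neighbours])
            w = np.array([sim for _, sim in neighbours], dtype=np.float64)
            w /= w.sum()                       # convex combination of neighbours
            rows.append(w @ source_emb[idx])   # similarity-weighted average
        else:
            # no usable neighbours: fall back to the usual random initialization
            rows.append(rng.normal(0.0, 0.02, size=d))
    return np.stack(rows)
```

Subwords with no usable neighbours fall back to random initialization, so the scheme degrades gracefully to the standard baseline.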
Large multilingual language models typically rely on a single vocabulary shared across 100+ language...
Recent advances in NLP have been driven by a range of large-scale pretrained language models (PLMs). Thes...
Cross-lingual pre-training has achieved great successes using monolingual and bilingual plain text c...
Pretrained language models (PLMs) are today the primary model for natural language processing. Despi...
Prior work shows that it is possible to expand pretrained Masked Language Models (MLMs) to new langu...
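For concreteness, the basic mechanics of such an expansion with the Hugging Face Transformers library look roughly as follows; the model name and the new subwords are placeholders, and the freshly added rows start from the library's default initialization, to be learned during continued pretraining (no smarter initialization is applied in this sketch).

```python
# Minimal sketch of vocabulary extension with Hugging Face Transformers.
# "xlm-roberta-base" and the new subwords below are placeholders.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

# Hypothetical target-language pieces (SentencePiece-style, "▁" marks a word start).
new_subwords = ["▁example_piece_a", "▁example_piece_b"]
num_added = tokenizer.add_tokens(new_subwords)

# Grow the input embeddings (and the tied MLM output layer) to the new vocabulary
# size; the added rows are randomly initialized and trained during continued pretraining.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} subwords; vocabulary size is now {len(tokenizer)}")
```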
Pre-trained language models (PLMs) have demonstrated impressive performance across various downstrea...
For multilingual sequence-to-sequence pretrained language models (multilingual Seq2Seq PLMs), e.g. m...
Recent research has discovered that a shared bilingual word embedding space can be induced by projec...
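One standard way to induce such a shared space is to learn a linear map between the two monolingual spaces from a seed dictionary; under an orthogonality constraint the map has a closed-form solution via orthogonal Procrustes. The sketch below uses toy data only to check that a known rotation is recovered, and it illustrates the generic technique rather than necessarily the exact method of the work summarized above.

```python
# Minimal sketch: learn an orthogonal projection between two monolingual
# embedding spaces from row-aligned seed-dictionary pairs (orthogonal Procrustes).
import numpy as np

def procrustes(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Return the orthogonal W minimizing ||X W - Y||_F.

    X: (n, d) source-language vectors of the seed-dictionary pairs.
    Y: (n, d) target-language vectors of the same pairs, row-aligned.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy usage: recover a known rotation from noiseless aligned pairs.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 300))                       # toy source vectors
W_true = np.linalg.qr(rng.normal(size=(300, 300)))[0]  # random orthogonal map
Y = X @ W_true                                         # toy target vectors
W = procrustes(X, Y)
print(np.allclose(W, W_true))                          # True on this toy data
```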
With the recent advances in deep learning, different approaches to improving pre-trained language mo...
Large pretrained language models (PreLMs) are revolutionizing natural language processing across al...
Large Language Models (LLMs), trained predominantly on extensive English data, often exhibit limitat...
Transfer learning based on pretraining language models on a large amount of ra...
Multilingual language models are widely used to extend NLP systems to low-resource languages. Howeve...
Large pretrained multilingual models, trained on dozens of languages, have delivered promising resul...
Pretrained multilingual contextual representations have shown great success, but due to the limits o...