Scaling existing applications and solutions to multiple human languages has traditionally proven difficult, mainly due to the language-dependent nature of the preprocessing and feature engineering techniques employed in traditional approaches. In this work, we empirically investigate the factors affecting language-independent models built with multilingual representations, including task type, language set, and data resources. On two of the most representative Natural Language Processing tasks, sentence classification and sequence labeling, we show that language-independent models can be comparable to or even outperform models trained on monolingual data, and that they are generally more effective on sentence classification. We experiment la...
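A "language-independent model" in this sense is a single classifier fine-tuned on data pooled across languages on top of a shared multilingual encoder. A minimal sketch of that setup, assuming an XLM-R checkpoint and a toy pooled sentiment batch (both assumptions, not the paper's actual configuration):

```python
# Minimal sketch of a language-independent sentence classifier: one multilingual
# encoder, one head, training data pooled across languages. The checkpoint and
# the toy examples are assumptions, not the paper's actual setup.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")
head = nn.Linear(encoder.config.hidden_size, 2)

# Pooled multilingual training batch: the same binary task in three languages.
texts = ["This movie was great.", "Ce film était terrible.", "Der Film war großartig."]
labels = torch.tensor([1, 0, 1])

optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(head.parameters()), lr=2e-5
)

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
sentence_vec = encoder(**batch).last_hidden_state[:, 0]  # first-token pooling
loss = nn.functional.cross_entropy(head(sentence_vec), labels)
loss.backward()
optimizer.step()
```

The same loop works unchanged for sequence labeling by replacing the single-vector head with a token-level classification head, which is what makes comparisons across the two task types straightforward.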
Thesis (Ph.D.)--University of Washington, 2022. Modern NLP systems have been highly successful at a wi...
Recurrent neural networks (RNNs) are exceptionally good models of distributions over natural languag...
Pre-trained multilingual models, such as mBERT, XLM-R and mT5, are used to improve the performance o...
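The usual recipe with these pretrained multilingual models is zero-shot cross-lingual transfer: fine-tune once in a high-resource language and query in languages never seen during fine-tuning. A sketch, assuming an NLI-tuned XLM-R checkpoint (the model name is an assumption; any similarly fine-tuned multilingual checkpoint behaves the same way):

```python
# Sketch of zero-shot cross-lingual transfer with a pretrained multilingual
# model; the checkpoint name is an assumption, not prescribed by the source.
from transformers import pipeline

clf = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")

# The classifier was fine-tuned on NLI data, yet can label text in other
# languages because the underlying encoder shares one subword vocabulary and
# one set of weights across roughly 100 languages.
print(clf(
    "La croissance économique ralentit en Europe.",  # French input
    candidate_labels=["economy", "sports", "health"],
))
```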
Multilingual language models are widely used to extend NLP systems to low-resource languages. Howeve...
How cross-linguistically applicable are NLP models, specifically language models? A fair comparison ...
Natural language processing (NLP) aims, broadly speaking, to teach computers to under...
NLP technologies are uneven across the world's languages, as state-of-the-art models are only availa...
Recent research has shown promise in multilingual modeling, demonstrating how a single model is capa...
There are over 7000 languages spoken on earth, but many of these languages suffer from a dearth of n...
Transfer learning based on pretraining language models on a large amount of ra...
Although multilingual pretrained models (mPLMs) have enabled support for various natural language processi...
Some natural languages belong to the same family or share similar syntactic and/or semantic regulari...
Pretrained multilingual language models have become a common tool in transferring NLP capabilities t...
Computational language models (LMs), most notably exemplified by the widespread success of OpenAI's ...