Unsupervised pretraining models have been shown to facilitate a wide range of downstream applications. These models, however, still encode only distributional knowledge, incorporated through language modeling objectives. In this work, we complement the encoded distributional knowledge with external lexical knowledge. We generalize the recently proposed (state-of-the-art) unsupervised pretraining model BERT to a multi-task learning setting: we couple BERT's masked language modeling and next sentence prediction objectives with an auxiliary binary word relation classification task, through which we inject clean linguistic knowledge into the model. Our initial experiments suggest that our "linguistically-informed" BERT (LIBERT) yields performan...
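To make the multi-task setup described above concrete, here is a minimal, hypothetical sketch of how such a joint loss could be assembled in PyTorch. It is not the authors' implementation: the module, head, and batch-field names (encoder, mlm_head, nsp_head, rel_head, mlm_labels, nsp_labels, relation_labels, rel_weight) are illustrative assumptions. The only point it shows is a shared encoder feeding three classification heads whose cross-entropy losses are summed.

```python
# Hypothetical sketch of a LIBERT-style multi-task pretraining loss:
# BERT's masked language modeling (MLM) and next sentence prediction (NSP)
# objectives combined with an auxiliary binary word-relation classifier.
import torch
import torch.nn as nn

class LexicallyInformedPretrainer(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_size: int, vocab_size: int):
        super().__init__()
        # Shared Transformer encoder, assumed to return token
        # representations of shape (batch, seq_len, hidden_size).
        self.encoder = encoder
        self.mlm_head = nn.Linear(hidden_size, vocab_size)  # masked-token prediction
        self.nsp_head = nn.Linear(hidden_size, 2)            # next sentence prediction
        self.rel_head = nn.Linear(hidden_size, 2)            # binary word-relation classification
        self.ce = nn.CrossEntropyLoss(ignore_index=-100)

    def forward(self, lm_batch: dict, rel_batch: dict, rel_weight: float = 1.0) -> torch.Tensor:
        # Standard BERT objectives on the text batch.
        hidden = self.encoder(lm_batch["input_ids"])          # (B, T, H)
        mlm_loss = self.ce(self.mlm_head(hidden).flatten(0, 1),
                           lm_batch["mlm_labels"].flatten())
        nsp_loss = self.ce(self.nsp_head(hidden[:, 0]),       # [CLS] representation
                           lm_batch["nsp_labels"])

        # Auxiliary objective on word pairs from an external lexical resource:
        # the label says whether the pair stands in a valid lexical relation.
        rel_hidden = self.encoder(rel_batch["input_ids"])
        rel_loss = self.ce(self.rel_head(rel_hidden[:, 0]),
                           rel_batch["relation_labels"])

        # Joint multi-task loss: language modeling + injected lexical knowledge.
        return mlm_loss + nsp_loss + rel_weight * rel_loss
```

The weighting factor on the auxiliary term is one way to let the lexical objective inform the shared representations without dominating the language modeling signal; how the actual model balances the objectives is not specified in the excerpt above.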
The core of self-supervised learning for pre-training language models includes pre-training task des...
Pre-training language models (LMs) on large-scale unlabeled text data makes the model much easier to...
Pre-training and fine-tuning have achieved great success in the natural language processing field. The stan...
Unsupervised pretraining models have been shown to facilitate a wide range of downstream NLP applica...
Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural l...
Transfer learning applies knowledge or patterns learned in a particular field or task to differe...
Since the first bidirectional deep learning model for natural language understanding, BERT, emerge...
Accepted at EACL 2021. Multilingual pretrained language models have demonstrated remarkable zero-shot ...
Though achieving impressive results on many NLP tasks, the BERT-like masked language models (MLM) en...
Recent research has shown promise in multilingual modeling, demonstrating how a single model is capa...
When pre-trained on large unsupervised textual corpora, language models are able to store and retri...
GPT-2 and BERT demonstrate the effectiveness of using pre-trained language models (LMs) on various n...
Thesis (Ph.D.)--University of Washington, 2022. A robust language processing machine should be able to...
Recently, pre-trained models have achieved state-of-the-art results in various language understanding...