Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations. MLM trains a model to predict a random sample of input tokens that have been replaced by a [MASK] placeholder, framed as multi-class classification over the entire vocabulary. During pretraining, it is common to combine MLM with other auxiliary objectives at the token or sequence level to improve downstream performance (e.g. next sentence prediction). However, no previous work has so far examined whether simpler objectives, linguistically intuitive or not, can serve as standalone main pretraining objectives. In this paper, we explore five simple pretraining objectives based on token-...
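For concreteness, the following is a minimal PyTorch sketch of the MLM objective as described above: a random sample of tokens is replaced by [MASK] and the model is trained with cross-entropy over the full vocabulary at those positions. The 15% masking rate, vocabulary size, [MASK] id, and the toy encoder are illustrative assumptions, not the paper's implementation, and the standard 80/10/10 replacement scheme used by BERT is omitted for brevity.

import torch
import torch.nn.functional as F

VOCAB_SIZE = 30522      # assumed BERT-like vocabulary size
MASK_ID = 103           # assumed [MASK] token id
MASK_PROB = 0.15        # fraction of tokens selected for prediction

def mask_tokens(input_ids: torch.Tensor):
    """Replace a random sample of tokens with [MASK] and build MLM labels."""
    labels = input_ids.clone()
    selected = torch.rand(input_ids.shape) < MASK_PROB  # positions to predict
    labels[~selected] = -100                            # ignored by the loss
    masked_inputs = input_ids.clone()
    masked_inputs[selected] = MASK_ID
    return masked_inputs, labels

# Toy stand-in for a Transformer encoder: embeddings + output head over the vocabulary.
encoder = torch.nn.Sequential(
    torch.nn.Embedding(VOCAB_SIZE, 64),
    torch.nn.Linear(64, VOCAB_SIZE),
)

batch = torch.randint(0, VOCAB_SIZE, (8, 128))   # fake token ids
masked_inputs, labels = mask_tokens(batch)
logits = encoder(masked_inputs)                  # shape: (8, 128, VOCAB_SIZE)
# Multi-class cross-entropy over the entire vocabulary, only at masked positions.
loss = F.cross_entropy(logits.view(-1, VOCAB_SIZE), labels.view(-1), ignore_index=-100)
loss.backward()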
Pre-trained language models (PTMs) have been shown to yield powerful text representations for dense pas...
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their gen...
Masked Language Models (MLMs) have shown superior performances in numerous downstream Natural Langua...
Pre-training a language model and then fine-tuning it for downstream tasks has demonstrated state-of...
The current era of natural language processing (NLP) has been defined by the prominence of pre-train...
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NM...
Recently, the development of pre-trained language models has brought natural language processing (NL...
Pretrained Masked Language Models (MLMs) have revolutionised NLP in recent years. However, previous ...
Unsupervised pretraining models have been shown to facilitate a wide range of downstream application...
We introduce FLOTA (Few Longest Token Approximation), a simple yet effective method to improve the t...
Masked Language Modeling (MLM) has proven to be an essential component of Vision-Language (VL) pretr...
Unsupervised pretraining models have been shown to facilitate a wide range of downstream NLP applica...
Though achieving impressive results on many NLP tasks, the BERT-like masked language models (MLM) en...