Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations. MLM trains a model to predict a random sample of input tokens that have been replaced by a [MASK] placeholder, framed as multi-class classification over the entire vocabulary. During pretraining, it is common to combine MLM with other auxiliary objectives at the token or sequence level to improve downstream performance (e.g. next sentence prediction). However, no previous work has so far examined whether simpler objectives, linguistically intuitive or not, can serve as standalone main pretraining objectives. In this paper, we explore five simple pretraining objectives based on token-...
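For concreteness, the following is a minimal PyTorch sketch of the MLM objective as described above: a random sample of tokens is replaced by [MASK] and the model is trained with cross-entropy over the full vocabulary at those positions. The 15% masking rate, vocabulary size, [MASK] id, and the toy encoder are illustrative assumptions, not the paper's implementation, and the standard 80/10/10 replacement scheme used by BERT is omitted for brevity.

import torch
import torch.nn.functional as F

VOCAB_SIZE = 30522      # assumed BERT-like vocabulary size
MASK_ID = 103           # assumed [MASK] token id
MASK_PROB = 0.15        # fraction of tokens selected for prediction

def mask_tokens(input_ids: torch.Tensor):
    """Replace a random sample of tokens with [MASK] and build MLM labels."""
    labels = input_ids.clone()
    selected = torch.rand(input_ids.shape) < MASK_PROB  # positions to predict
    labels[~selected] = -100                            # ignored by the loss
    masked_inputs = input_ids.clone()
    masked_inputs[selected] = MASK_ID
    return masked_inputs, labels

# Toy stand-in for a Transformer encoder: embeddings + output head over the vocabulary.
encoder = torch.nn.Sequential(
    torch.nn.Embedding(VOCAB_SIZE, 64),
    torch.nn.Linear(64, VOCAB_SIZE),
)

batch = torch.randint(0, VOCAB_SIZE, (8, 128))   # fake token ids
masked_inputs, labels = mask_tokens(batch)
logits = encoder(masked_inputs)                  # shape: (8, 128, VOCAB_SIZE)
# Multi-class cross-entropy over the entire vocabulary, only at masked positions.
loss = F.cross_entropy(logits.view(-1, VOCAB_SIZE), labels.view(-1), ignore_index=-100)
loss.backward()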
Pre-trained language models (PTMs) have been shown to yield powerful text representations for dense pas...
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their gen...
Masked Language Models (MLMs) have shown superior performances in numerous downstream Natural Langua...
Pre-training a language model and then fine-tuning it for downstream tasks has demonstrated state-of...
The current era of natural language processing (NLP) has been defined by the prominence of pre-train...
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NM...
Recently, the development of pre-trained language models has brought natural language processing (NL...
Pretrained Masked Language Models (MLMs) have revolutionised NLP in recent years. However, previous ...
Unsupervised pretraining models have been shown to facilitate a wide range of downstream application...
We introduce FLOTA (Few Longest Token Approximation), a simple yet effective method to improve the t...
Masked Language Modeling (MLM) has proven to be an essential component of Vision-Language (VL) pretr...
Unsupervised pretraining models have been shown to facilitate a wide range of downstream NLP applica...
Though achieving impressive results on many NLP tasks, the BERT-like masked language models (MLM) en...