The core of self-supervised learning for pre-training language models includes pre-training task design as well as appropriate data augmentation. Most data augmentations in language model pre-training are context-independent. A seminal contextualized augmentation was recently proposed in ELECTRA, which achieved state-of-the-art performance by introducing an auxiliary generation network (generator) to produce contextualized data augmentation for the training of a main discrimination network (discriminator). This design, however, incurs the extra computation cost of the generator and requires tuning the relative capability of the generator and the discriminator. In this paper, we propose a self-augmentation strategy (SAS) where a single ne...
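To make the ELECTRA-style setup described above concrete, the following is a minimal PyTorch sketch (not the authors' implementation) of replaced-token detection: a small generator fills masked positions with sampled tokens, and a discriminator learns to flag which tokens were replaced. The tiny GRU encoder, toy vocabulary, batch, and hyperparameters are illustrative assumptions standing in for the actual Transformer models and pre-training data.

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN, MASK_ID = 1000, 64, 0

class TinyEncoder(nn.Module):
    """Stand-in for a Transformer encoder, kept tiny for illustration."""
    def __init__(self, out_dim):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.mix = nn.GRU(HIDDEN, HIDDEN, batch_first=True)  # placeholder for self-attention layers
        self.head = nn.Linear(HIDDEN, out_dim)

    def forward(self, tokens):
        hidden, _ = self.mix(self.embed(tokens))
        return self.head(hidden)

generator = TinyEncoder(out_dim=VOCAB)   # auxiliary network: predicts tokens at masked positions
discriminator = TinyEncoder(out_dim=1)   # main network: predicts whether each token was replaced
opt = torch.optim.Adam(
    list(generator.parameters()) + list(discriminator.parameters()), lr=1e-3
)

tokens = torch.randint(1, VOCAB, (8, 32))        # toy batch of token ids
mask = torch.rand(tokens.shape) < 0.15           # mask ~15% of positions
masked_tokens = tokens.masked_fill(mask, MASK_ID)

# Generator: standard masked-language-modeling loss on the masked positions.
gen_logits = generator(masked_tokens)
gen_loss = F.cross_entropy(gen_logits[mask], tokens[mask])

# Contextualized augmentation: sample plausible replacements from the generator.
# Sampling is detached so discriminator gradients never reach the generator.
with torch.no_grad():
    samples = torch.distributions.Categorical(logits=gen_logits[mask]).sample()
corrupted = tokens.clone()
corrupted[mask] = samples
is_replaced = (corrupted != tokens).float()      # labels for replaced-token detection

# Discriminator: binary replaced-token detection over every position.
disc_logits = discriminator(corrupted).squeeze(-1)
disc_loss = F.binary_cross_entropy_with_logits(disc_logits, is_replaced)

# ELECTRA up-weights the discriminator loss (lambda = 50 in the paper); the
# separate generator here is exactly the overhead that SAS aims to remove by
# letting a single network produce and consume its own augmentations.
opt.zero_grad()
(gen_loss + 50.0 * disc_loss).backward()
opt.step()

In ELECTRA the generator is trained only with the MLM loss and its sampled outputs are treated as fixed inputs to the discriminator, which is why the sampling step above is wrapped in torch.no_grad().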
Natural language processing (NLP) techniques have been significantly improved by introducing pre-trained l...
Machine-learning models can reach very high performance with supervised training, where they learn f...
Lately, the self-attention mechanism has marked a new milestone in the field of automatic s...
Deep neural models (e.g., Transformer) naturally learn spurious features, which create a "shortcut"...
Thesis (Ph.D.)--University of Washington, 2022. A robust language processing machine should be able to...
In self-supervised learning, one trains a model to solve a so-called pretext task on a dataset witho...
Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural l...
The current era of natural language processing (NLP) has been defined by the prominence of pre-train...
Unsupervised pretraining models have been shown to facilitate a wide range of downstream application...
These improvements open many possibilities for solving downstream Natural Language Processing tasks. ...
Self-supervised pre-training of language models usually consists in predicting probability distribut...
With appropriate pre-training on unstructured text, larger and more accurate neural network models c...
The recurrent neural network language model (RNNLM) has been demonstrated to consistently reduce per...