State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle when faced with Out-of-Vocabulary (OOV) words. To address this issue, we follow the principle of mimick-like models and generate vectors for unseen words by learning the behavior of pre-trained embeddings from the surface form of words alone. We present a simple contrastive learning framework, LOVE, which extends the word representation of an existing pre-trained language model (such as BERT) and makes it robust to OOV words with few additional parameters. Extensive evaluations demonstrate that our lightweight model achieves similar or even better performance than prior competitors, both on original datasets and on corrupted variants.
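To make the idea concrete, the sketch below shows one way a mimick-style contrastive objective of this kind can be set up: a character-level encoder maps a word's surface form to a vector, and an in-batch contrastive loss pulls that vector toward the word's frozen pre-trained embedding while pushing it away from the embeddings of other words in the batch. This is a minimal illustration under assumed choices (the `CharCNNEncoder`, `infonce_loss`, dimensions, and the CNN encoder itself are hypothetical stand-ins), not the authors' actual LOVE implementation.

```python
# Minimal sketch of a mimick-style contrastive objective (PyTorch).
# All names and hyperparameters here are illustrative assumptions,
# not the LOVE authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharCNNEncoder(nn.Module):
    """Encodes a word's surface form (character ids) into a vector."""
    def __init__(self, n_chars, char_dim=64, out_dim=300, kernel=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.conv = nn.Conv1d(char_dim, out_dim, kernel_size=kernel, padding=1)

    def forward(self, char_ids):              # char_ids: (batch, max_word_len)
        x = self.char_emb(char_ids)           # (batch, len, char_dim)
        x = self.conv(x.transpose(1, 2))      # (batch, out_dim, len)
        return x.max(dim=2).values            # max-pool over characters -> (batch, out_dim)

def infonce_loss(pred, target, temperature=0.07):
    """In-batch contrastive loss: each predicted vector should match the
    pre-trained embedding of its own word, not those of the other words."""
    pred = F.normalize(pred, dim=-1)
    target = F.normalize(target, dim=-1)
    logits = pred @ target.t() / temperature            # (batch, batch) similarities
    labels = torch.arange(pred.size(0), device=pred.device)
    return F.cross_entropy(logits, labels)

# Training-step sketch: char_ids hold the surface forms of known words,
# target_emb their frozen pre-trained embeddings (e.g. from BERT or fastText).
encoder = CharCNNEncoder(n_chars=100)
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

char_ids = torch.randint(1, 100, (32, 12))   # dummy batch: 32 words, 12 chars each
target_emb = torch.randn(32, 300)            # stand-in for the frozen target vectors

loss = infonce_loss(encoder(char_ids), target_emb)
loss.backward()
optimizer.step()
```

At inference time, such an encoder can produce a vector for any unseen or misspelled word from its characters alone, which is what makes the approach robust to OOV inputs with only a small number of extra parameters.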