Rare word representation has recently enjoyed a surge of interest, owing to the crucial role that effective handling of infrequent words can play in accurate semantic understanding. However, there is a paucity of reliable benchmarks for evaluation and comparison of these techniques. We show in this paper that the only existing benchmark (the Stanford Rare Word dataset) suffers from low-confidence annotations and limited vocabulary; hence, it does not constitute a solid comparison framework. In order to fill this evaluation gap, we propose CAmbridge Rare word Dataset (CARD-660), an expert-annotated word similarity dataset which provides a highly reliable, yet challenging, benchmark for rare word representation techniques. Through a set of ex...
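As a concrete illustration of how a word-similarity benchmark of this kind is consumed, the sketch below scores an embedding table against a file of annotated pairs and reports Spearman correlation. The tab-separated file format, function names, and OOV handling are assumptions for illustration; this is not the dataset's official evaluation script.

```python
# Minimal word-similarity evaluation sketch (assumed file format:
# "word1<TAB>word2<TAB>gold_score", one pair per line).
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate(pairs_path, vectors):
    """vectors: dict mapping word -> np.ndarray embedding."""
    gold, predicted, missed = [], [], 0
    with open(pairs_path, encoding="utf-8") as f:
        for line in f:
            w1, w2, score = line.rstrip("\n").split("\t")
            if w1 in vectors and w2 in vectors:
                gold.append(float(score))
                predicted.append(cosine(vectors[w1], vectors[w2]))
            else:
                missed += 1  # rare-word benchmarks are dominated by OOV pairs
    rho, _ = spearmanr(gold, predicted)
    return rho, missed
```

On a rare-word benchmark the missed count matters as much as the correlation itself, since techniques differ mainly in how many pairs they can score at all.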
There are two main types of word representations: low-dimensional embeddings and high-dimensional d...
Lexical-semantic resources, including thesauri ...
Word embeddings — distributed word representations that can be learned from unlabelled data — have b...
Pretraining deep neural network architectures with a language modeling objective has brought large i...
Pretraining deep language models has led to large performance gains in NLP. Despite this success, Sc...
It is common in modern prediction problems for many predictor variables to be counts of rarely occur...
This paper presents WAGS (Word Alignment Gold Standard), a novel benchmark which allows extensive ev...
We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word ...
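Subword composition is the standard mechanism by which FastText-style models assign vectors to rare or unseen words. The sketch below shows plain (non-probabilistic) character n-gram composition under common assumptions (boundary markers, hashed n-gram buckets); it is not the multisense probabilistic model introduced above.

```python
# FastText-style subword composition sketch; bucket count, n-gram
# range, and dimensionality here are illustrative assumptions.
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    # Boundary markers let the model distinguish prefixes and suffixes.
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def subword_vector(word, ngram_table):
    # A word's vector is the mean of its hashed n-gram vectors, so any
    # string receives a representation even if never seen in training.
    buckets = ngram_table.shape[0]
    grams = char_ngrams(word)
    rows = [ngram_table[hash(g) % buckets] for g in grams]
    return np.mean(rows, axis=0)

# Toy stand-in table: 100k hash buckets, 50-dimensional vectors.
ngram_table = np.random.randn(100_000, 50) * 0.01
vec = subword_vector("somnambulist", ngram_table)
```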
Word embeddings are a key component of high-performing natural language processing (NLP) systems, bu...
The most interesting words in scientific texts will often be novel or rare. This presents a challeng...
Language model fusion helps smart assistants recognize words which are rare in acoustic data but abu...
Word embedding techniques heavily rely on the abundance of training data for individual words. Given...
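One generic fallback when a word has little or no training data is to back off to its dictionary gloss. The sketch below, a centroid heuristic with hypothetical names, is offered only as an illustration of that idea, not as the method of the paper above.

```python
# Illustrative gloss-centroid fallback for rare/OOV words: represent a
# word by the mean embedding of the words in its dictionary definition.
import numpy as np

def gloss_vector(gloss, vectors, dim=300):
    tokens = [t for t in gloss.lower().split() if t in vectors]
    if not tokens:
        return np.zeros(dim)  # no usable context: back off to a zero vector
    return np.mean([vectors[t] for t in tokens], axis=0)

# Usage: vectors is any word -> np.ndarray table, e.g. pretrained embeddings.
# gloss_vector("a person who walks in their sleep", vectors)
```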
Creating word embeddings that reflect semantic relationships encoded in lexical knowledge resources ...
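A classic technique in this family is retrofitting in the style of Faruqui et al. (2015), which nudges each vector toward its neighbours in a lexical graph while keeping it close to the original. The sketch below uses a uniform-weight closed-form update, a simplification of the published objective.

```python
# Retrofitting sketch with uniform neighbour weights (a simplification
# of the Faruqui et al. objective; weighting scheme is an assumption).
import numpy as np

def retrofit(vectors, lexicon, iterations=10):
    """vectors: word -> np.ndarray; lexicon: word -> list of neighbour words."""
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            nbrs = [n for n in neighbours if n in new]
            if word not in new or not nbrs:
                continue
            # Closed-form update: stay near the original vector while
            # moving toward the current vectors of lexicon neighbours.
            new[word] = (vectors[word] + sum(new[n] for n in nbrs)) / (1 + len(nbrs))
    return new
```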