We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. In particular, we represent each word with a Gaussian mixture density, where the mean of a mixture component is given by the sum of n-grams. This representation allows the model to share statistical strength across sub-word structures (e.g. Latin roots), producing accurate representations of rare, misspelt, or even unseen words. Moreover, each component of the mixture can capture a different word sense. Probabilistic FastText outperforms both FastText, which has no probabilistic model, and dictionary-level probabilistic embeddings, which do not incorporate subword structures, on several...
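To make the subword construction above concrete, the following minimal sketch builds the mean of one mixture component as the sum of the word's character n-gram vectors, in the spirit of FastText subword embeddings. It is an illustration under assumptions, not the authors' implementation: the helper `extract_ngrams` and the randomly initialized `ngram_table` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50
ngram_table = {}  # hypothetical lookup: character n-gram -> vector

def extract_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of the word, padded with boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def component_mean(word):
    """Mean of one Gaussian mixture component: the sum of its n-gram vectors."""
    total = np.zeros(dim)
    for gram in extract_ngrams(word):
        # Shared n-grams let rare or misspelt words reuse statistical strength.
        vec = ngram_table.setdefault(gram, rng.normal(scale=0.1, size=dim))
        total += vec
    return total

print(component_mean("misspelt").shape)  # (50,)
```

Because every word decomposes into n-grams, even an unseen or misspelt word receives a usable component mean through the n-grams it shares with known words.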
We describe NRC's submission to the Anomaly Detection/Text Mining competition organised at the Text ...
Word embedding is a feature learning technique which aims at mapping words from a vocabulary into ve...
Continuous word representations that can capture the semantic information in the corpus are the buil...
Distributed word representations have been widely used and proven to be useful in quite a few natura...
The digital era floods us with an excessive amount of text data. To make sense of such data automati...
We propose a new word embedding model, inspired by GloVe, which is formulated as a feasible least sq...
There are two main types of word representations: low-dimensional embeddings and high-dimensional d...
We demonstrate the benefits of probabilistic representations due to their expressiveness which allow...
Several recent studies have shown the benefits of combining language and perce...
Pretraining deep language models has led to large performance gains in NLP. Despite this success, Sc...
Traditional natural language processing has been shown to have excessive reliance on human-annotated...
Distributed word embeddings have yielded state-of-the-art performance in many NLP tasks, mainly due ...
The GloVe word embedding model relies on solving a global optimization problem, which can be reformu...
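For concreteness, the snippet below evaluates the standard GloVe weighted least-squares objective, $J = \sum_{ij} f(X_{ij})\,(w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij})^2$, which is the global optimization problem the abstract refers to. The toy co-occurrence matrix and variable names are illustrative only, not drawn from the cited work.

```python
import numpy as np

def glove_loss(W, W_ctx, b, b_ctx, X, x_max=100.0, alpha=0.75):
    """GloVe objective over the nonzero entries of the co-occurrence matrix X."""
    loss = 0.0
    rows, cols = np.nonzero(X)
    for i, j in zip(rows, cols):
        # Standard GloVe weighting: f(x) = (x / x_max)^alpha, capped at 1.
        weight = min((X[i, j] / x_max) ** alpha, 1.0)
        residual = W[i] @ W_ctx[j] + b[i] + b_ctx[j] - np.log(X[i, j])
        loss += weight * residual ** 2
    return loss

# Toy usage: a random 5-word vocabulary with 10-dimensional vectors.
rng = np.random.default_rng(0)
V, d = 5, 10
X = rng.integers(0, 20, size=(V, V)).astype(float)
print(glove_loss(rng.normal(size=(V, d)), rng.normal(size=(V, d)),
                 rng.normal(size=V), rng.normal(size=V), X))
```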
In this paper, word sense disambiguation (WSD) accuracy achievable by a probabilistic classifier, usi...
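The abstract above is cut off, but one common probabilistic classifier for WSD is a Naive Bayes model over context words. The minimal sketch below, with made-up training contexts for the ambiguous word "bank", shows that idea; it is not taken from the cited paper.

```python
from collections import Counter, defaultdict
import math

# Hypothetical sense-labelled contexts for the ambiguous word "bank".
train = [
    (["river", "water", "fishing"], "bank_river"),
    (["money", "loan", "deposit"], "bank_finance"),
    (["deposit", "account", "interest"], "bank_finance"),
    (["shore", "river", "mud"], "bank_river"),
]

sense_counts = Counter(sense for _, sense in train)
word_counts = defaultdict(Counter)
for context, sense in train:
    word_counts[sense].update(context)

def disambiguate(context, alpha=1.0):
    """Pick the sense maximizing log P(sense) + sum of log P(word | sense), Laplace-smoothed."""
    vocab = {w for counts in word_counts.values() for w in counts}
    best_sense, best_score = None, float("-inf")
    for sense, prior in sense_counts.items():
        total = sum(word_counts[sense].values())
        score = math.log(prior / len(train))
        for w in context:
            score += math.log((word_counts[sense][w] + alpha) /
                              (total + alpha * len(vocab)))
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

print(disambiguate(["loan", "interest"]))  # -> bank_finance
```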