International audience. The distributional hypothesis relies on the recurrence of information in the contexts of the words to be associated. However, the vector space models that implement this approach suffer from data sparsity and from a high number of dimensions in the matrix of contextual vectors. While reducing data sparseness is important with general corpora, it is also a major issue with specialised corpora, which are much smaller and have much lower context frequencies. We tackle the problem of data sparsity in specialised texts and propose a method that makes the matrix denser. To achieve this goal, we use synonymy and hypernymy relations acquired on the corpora to normalise and then generalise the distributi...
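The densification idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the relations are applied to context words, and it uses hand-written toy mappings in place of relations acquired on a corpus. The names `build_matrix`, `rewrite_contexts`, `density`, and all the data are hypothetical.

```python
from collections import Counter, defaultdict


def build_matrix(pairs):
    """Count (target, context) co-occurrences in a dict of Counters."""
    matrix = defaultdict(Counter)
    for target, context in pairs:
        matrix[target][context] += 1
    return matrix


def rewrite_contexts(matrix, mapping):
    """Merge context columns: each context c becomes mapping.get(c, c)."""
    merged = defaultdict(Counter)
    for target, contexts in matrix.items():
        for context, count in contexts.items():
            merged[target][mapping.get(context, context)] += count
    return merged


def density(matrix):
    """Share of non-zero cells in the target-by-context matrix."""
    columns = {c for contexts in matrix.values() for c in contexts}
    filled = sum(len(contexts) for contexts in matrix.values())
    cells = len(matrix) * len(columns)
    return filled / cells if cells else 0.0


# Toy co-occurrence pairs (hypothetical, for illustration only).
pairs = [
    ("aspirin", "headache"), ("aspirin", "fever"),
    ("ibuprofen", "cephalalgia"), ("ibuprofen", "fever"),
    ("paracetamol", "migraine"), ("paracetamol", "fever"),
]

# Assumed relations; in the abstract they are acquired on the corpora.
synonyms = {"cephalalgia": "headache"}   # synonym -> canonical form
hypernyms = {"migraine": "headache"}     # hyponym -> hypernym

raw = build_matrix(pairs)
normalised = rewrite_contexts(raw, synonyms)            # normalisation step
generalised = rewrite_contexts(normalised, hypernyms)   # generalisation step

print(density(raw), density(normalised), density(generalised))
```

In this toy setting each step merges context columns, so the same counts are spread over fewer dimensions and the share of non-zero cells grows.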
Institute for Communicating and Collaborative Systems. Lexical-semantic resources, including thesauri ...
Distributional similarity is a widely used concept to capture the semantic relatedness of words in v...
Applying distributional semantic models to medium-size specialized corpora is ...
In specialised domains, applications such as information retrieval or machine translation rely ...
In specialised domains, applications such as information retrieval or transl...
Distributional semantic models (DSMs) have been effective at representing semantics at the word lev...
Vector Space Models are limited with low-frequency words due to few available contexts and data spa...
Entity normalization (or entity linking) is an important subtask of informatio...
Distributional semantics allows models of linguistic meaning to be derived from observations of lang...
Distributional semantics models can be built using simple bag-of-word represen...
This paper addresses the methodology of a distributional analysis in a relatively small corpus in a ...
The hypothesis that word co-occurrence statistics extracted from text corpora can provide a basis fo...
In the field of Natural Language Processing, supervised machine learning is commonly used to solve c...