In this position paper, we review a problem common to many NLP tasks: computing similarity (or distance) between texts. We aim to show that this step, often regarded as a small component of a broader, complex system, is frequently overlooked, which leads to the use of sub-optimal solutions. Indeed, computing similarity with TF-IDF weighting and cosine is often presented as "state-of-the-art", while more effective alternatives exist in the Information Retrieval (IR) community. Through experiments on several tasks, we show how this simple similarity calculation can influence system performance. We consider two particular alternatives. The first is the Okapi-BM25 weighting scheme, well known in IR and directly intercha...
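To make the contrast in the abstract above concrete, here is a minimal sketch, not the authors' implementation, that ranks a few documents against a query in two ways: TF-IDF weighting with cosine, and Okapi BM25 scoring. The whitespace tokenizer, the smoothed IDF formula, the non-negative BM25 IDF variant, the parameter values `k1=1.2` and `b=0.75`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately naive assumption)."""
    return text.lower().split()


def idf_table(docs_tokens):
    """Smoothed IDF, log((N + 1) / (df + 1)) + 1, for every corpus term."""
    n_docs = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {t: math.log((n_docs + 1) / (n + 1)) + 1.0 for t, n in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    tf = Counter(tokens)
    return {t: count * idf[t] for t, count in tf.items() if t in idf}


def cosine(u, v):
    """Cosine of two sparse vectors; 0.0 if either vector is empty."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Okapi BM25 score of each document for the query.

    Assumptions: each distinct query term counts once, and the IDF uses
    the non-negative variant log((N - df + 0.5) / (df + 0.5) + 1).
    """
    n_docs = len(docs_tokens)
    if n_docs == 0:
        return []
    # Fall back to 1.0 so an all-empty corpus does not divide by zero.
    avgdl = sum(len(d) for d in docs_tokens) / n_docs or 1.0
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    scores = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        # Document-length normalisation: longer documents are penalised.
        length_norm = k1 * (1.0 - b + b * len(tokens) / avgdl)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Toy data, invented for this sketch.
docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
query = "cat on a mat"

docs_tokens = [tokenize(d) for d in docs]
idf = idf_table(docs_tokens)
doc_vectors = [tfidf_vector(tokens, idf) for tokens in docs_tokens]
query_vector = tfidf_vector(tokenize(query), idf)

tfidf_sims = [cosine(query_vector, v) for v in doc_vectors]
bm25 = bm25_scores(tokenize(query), docs_tokens)
```

Both scorers take the same tokenized input and return one score per document, so one can be swapped for the other in a pipeline without touching the surrounding code. The sketch does not reproduce any result from the paper; on real data the two rankings can differ, which is the kind of effect the abstract describes.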
Sentence similarity plays an important role in many text-related research and applications such as i...
Document similarity search is to find documents similar to a given query document and return a ranke...
The notion of similarity between texts is fundamental for many applications of Natural Language Proc...
Measuring the similarity between two texts is a fundamental problem in many NLP and IR applications....
Measuring the semantic similarity of texts has a vital role in various tasks from the field of natur...
Comparing textual content is becoming more and more problematic due to the fact that nowadays data i...
We assess the suitability of word embeddings for practical information retrieval scenarios. Thus, we...
It is an obstacle for the beginner to use keywords to search for related documents from ...
We present a comprehensive study of computing similarity between texts. We start from the observatio...
The semantic comparison of short sections of text is an emerging aspect of Natural Language Processi...
Improved the Semantic Similarity with Weighting Vectors. Semantic textual simil...
Most text processing systems need to compare lexical units – words, entities, semantic concepts – wi...