We present a novel n-gram-based string matching technique, which we call targeted s-gram matching. In this technique, n-grams are classified into categories on the basis of character contiguity in words, and the categories are then used in matching. We compared the technique with conventional n-gram matching, which uses adjacent characters as n-grams, on several types of words and word pairs. English, German, and Swedish query keys were matched against their Finnish spelling variants and Finnish morphological variants using a target word list of 119 000 Finnish words. In all cross-lingual tests, the targeted s-gram matching technique outperformed the conventional n-gram matching technique. The technique was highl...
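The idea of grouping n-grams by character contiguity can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of digrams, the default grouping of skip distances into the classes (0,) and (1, 2), and the overlap-based similarity score are all assumptions made for illustration. A skip distance of 0 gives the conventional adjacent-character digrams.

```python
from __future__ import annotations


def skip_digrams(word: str, skip: int) -> set[str]:
    """Return digrams built from characters that have `skip` characters between them.

    skip=0 yields conventional adjacent-character digrams.
    """
    step = skip + 1
    return {word[i] + word[i + step] for i in range(len(word) - step)}


def class_digrams(word: str, skips: tuple[int, ...]) -> set[str]:
    """Union of the skip-digram sets for every skip distance in one class."""
    grams: set[str] = set()
    for skip in skips:
        grams |= skip_digrams(word, skip)
    return grams


def sgram_similarity(
    a: str,
    b: str,
    classes: tuple[tuple[int, ...], ...] = ((0,), (1, 2)),  # assumed grouping
) -> float:
    """Shared digrams divided by all digrams, accumulated class by class.

    Digrams are only compared within the same class, so an adjacent-character
    digram never matches a skipped-character digram. This scoring rule is an
    illustrative assumption.
    """
    shared = 0
    total = 0
    for skips in classes:
        grams_a = class_digrams(a, skips)
        grams_b = class_digrams(b, skips)
        shared += len(grams_a & grams_b)
        total += len(grams_a | grams_b)
    return shared / total if total else 0.0


def rank_candidates(query: str, targets: list[str]) -> list[tuple[str, float]]:
    """Sort target words by descending similarity to the query key."""
    scored = [(t, sgram_similarity(query, t)) for t in targets]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical miniature target list; the study used 119 000 Finnish words.
    for word, score in rank_candidates("pharmacology", ["farmakologia", "filosofia"]):
        print(f"{word}: {score:.3f}")
```

Passing `classes=((0,),)` reduces the score to plain adjacent-digram overlap, which gives a simple baseline to compare against under the same assumptions.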
Multiword units (MWUs), also known as linguistic prefabs or chunks, are increasingly recognised not ...
The Rabin-Karp algorithm is frequently used to determine the similarity between texts, using the hash fu...
This article tackles multilingual automatic alignment. Alignment refers to the ...
The field of Cross-Language Information Retrieval relates techniques close to both the Machine Trans...
For European languages, n-grams have proved to be the cost-effective alternative to morphological proc...
© 2014 Pelemans et al. In this paper we examine several combinations of classical N-gram language m...
This paper presents a method to improve a word alignment model in a phrase-based Statistical Machine...
Several studies regarding excellent exact string matching algorithms can be used to identify similar...
For research evaluation, publication lists need to be matched to entries in large bibliographic data...
This paper describes an extension of the n-gram language model: the similar n-...
N-gram-based indexing has proved to be a useful technique for efficient document...
Text matching is the process of identifying and locating particular text matches in raw data. Text m...
The traditional approach to information retrieval is based on using words as the indexing and search...