String similarity join is an essential operation in many applications. In this paper, we investigate the problem of string similarity join with edit distance constraints. A trie-based edit similarity join framework has been proposed recently. The main advantage of existing trie-based algorithms is their support for similarity joins on short strings; their main weakness appears when joining long strings under large edit distance thresholds. These methods generate and maintain many similar prefixes, called active nodes, which must then be removed in a subsequent pruning phase. With a large edit distance threshold, the number of active nodes becomes quite large. We propose a new trie-based join algorithm called PreJoin, which improves upon current trie-based join ...
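To make the idea of prefix pruning on a trie concrete, the following is a minimal Python sketch of a generic trie-based edit similarity self-join. It is not PreJoin and not the framework the abstract refers to; it uses the textbook technique of carrying one dynamic-programming row per trie node and abandoning a subtree once every entry in the row exceeds the threshold. All names (`TrieNode`, `build_trie`, `search`, `self_join`, `tau`) are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None  # set when an indexed string ends here


def build_trie(strings: List[str]) -> TrieNode:
    """Index all strings in a trie so that shared prefixes are stored once."""
    root = TrieNode()
    for s in strings:
        node = root
        for ch in s:
            node = node.children.setdefault(ch, TrieNode())
        node.word = s
    return root


def search(root: TrieNode, query: str, tau: int) -> List[Tuple[str, int]]:
    """Return (string, distance) for indexed strings within edit distance tau."""
    results: List[Tuple[str, int]] = []
    # Row for the empty trie prefix: distance to each prefix of the query.
    stack = [(root, list(range(len(query) + 1)))]
    while stack:
        node, row = stack.pop()
        if node.word is not None and row[-1] <= tau:
            results.append((node.word, row[-1]))
        for ch, child in node.children.items():
            # Extend the trie prefix by one character: standard Levenshtein row.
            new_row = [row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                new_row.append(min(new_row[j - 1] + 1,   # skip a query character
                                   row[j] + 1,           # skip the trie character
                                   row[j - 1] + cost))   # match or substitute
            # Row minima never decrease with depth, so a row whose minimum
            # exceeds tau means no descendant can be within the threshold.
            if min(new_row) <= tau:
                stack.append((child, new_row))
    return results


def self_join(strings: List[str], tau: int) -> List[Tuple[str, str, int]]:
    """Naive self-join: probe the trie once per string, report each pair once."""
    root = build_trie(strings)
    pairs = set()
    for s in strings:
        for t, d in search(root, s, tau):
            if s < t:
                pairs.add((s, t, d))
    return sorted(pairs)


if __name__ == "__main__":
    data = ["kitten", "sitten", "sitting", "mitten", "apple"]
    for pair in self_join(data, 1):
        print(pair)
```

In this sketch, a node that survives the `min(new_row) <= tau` check loosely plays the role of a prefix that is still similar enough to keep. As `tau` grows, more nodes survive that check, which is one way to see why large thresholds are costly for trie-based methods.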
Similarity Join plays an important role in data integration and cleansing, record linkage and data d...
© 2017 IEEE. String similarity search is a fundamental query that has been widely used for DNA seque...
Conference Name: 19th International Conference on Database Systems for Advanced Applications, DASFAA ...
Abstract A string similarity join finds similar pairs between two collections of strings. Many appli...
A string similarity join finds all similar pairs between two collections of strings. It is an essent...
As an essential operation in data cleaning, the similarity join has attracted considerable attention...
Abstract—The string similarity join, which is employed to find similar string pairs from string sets...
In this thesis, we study efficient exact query processing algorithms for edit similarity queries and...
We study the string similarity search problem with edit-distance constraints, which, given a set of ...
We study the string similarity search problem with edit-distance constraints, which, given a set of ...
Edit distance is the most widely used method to quantify similarity between two strings. We investig...
Given a large collection of tree-structured objects (e.g., XML documents), the similarity join finds...
String similarity join is an important operation in data integration and cleansing that finds simil...
A similarity join aims to find all similar pairs between two collections of records. Established alg...
Abstract — Similarity Join is an important operation in data integration and cleansing, record linka...