We propose a novel string search algorithm for data stored once and read many times. Our search method combines the sublinear traversal of the record (as in Boyer-Moore or Knuth-Morris-Pratt) with the agglomeration of parts of the record and search pattern into a single character, the algebraic signature, in the manner of Karp-Rabin. Our experiments show that our algorithm is up to seventy times faster for DNA data, up to eleven times faster for ASCII, and up to six times faster for XML documents compared with an implementation of Boyer-Moore. To obtain this speed-up, we store records in encoded form, where each original character is replaced with an algebraic signature. Our method applies to records stored in databases in g...
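For readers unfamiliar with the Karp-Rabin idea that the abstract above refers to, the following is a minimal sketch of classic Karp-Rabin search with a rolling hash. It is not the algebraic-signature method of the abstract: it uses an ordinary polynomial hash modulo a prime rather than a signature over a Galois field, and it scans every position rather than skipping ahead. The function name `karp_rabin_search` and the `base` and `modulus` values are illustrative assumptions.

```python
def karp_rabin_search(text: str, pattern: str,
                      base: int = 256,
                      modulus: int = 1_000_000_007) -> list[int]:
    """Return every start index at which pattern occurs in text.

    Illustrative sketch only: a polynomial rolling hash stands in for
    the algebraic signature described in the abstract (assumption).
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, modulus)

    # Hash the pattern and the first window of the text.
    pattern_hash = 0
    window_hash = 0
    for i in range(m):
        pattern_hash = (pattern_hash * base + ord(pattern[i])) % modulus
        window_hash = (window_hash * base + ord(text[i])) % modulus

    matches = []
    for start in range(n - m + 1):
        # Equal hashes only suggest a match; verify to rule out collisions.
        if window_hash == pattern_hash and text[start:start + m] == pattern:
            matches.append(start)
        # Roll the window: drop text[start], append text[start + m].
        if start + m < n:
            window_hash = ((window_hash - ord(text[start]) * high) * base
                           + ord(text[start + m])) % modulus
    return matches
```

Calling `karp_rabin_search("GATTACAGATTACA", "TTAC")` reports both occurrences of the pattern. The explicit substring comparison after a hash hit is what keeps the result exact even when two different windows hash to the same value.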
The proliferation of online text, such as found on the World Wide Web and in online databases, motiv...
We present an algorithm for searching regular expression matches in compressed text. The algorithm r...
In many large chemoinformatics database systems, molecules are represented by long binary fingerpri...
We propose a novel string (pattern) matching algorithm called n-gram search. We intend it for the re...
We present the AS-Index, a new index structure for exact string search in disk-resident dat...
Scalable Distributed Data Structures (SDDS) are a class of data structures for multicomputers (a dis...
A new string searching algorithm is presented aimed at searching for the occurrence of character pat...
This paper describes a new method of indexing and searching large binary signature collections to e...
Motivated by the imminent growth of massive, highly redundant genomic databases, we study ...
AS-Index is a new index structure for exact string search in disk-resident databases. It uses hashin...
The proliferation of online text, such as on the World Wide Web and in databases, motivates the need...
The rapid development of systems such as Yandex and Google has predetermined the...
We present a new bit-parallel technique for approximate string matching. We build on two previous te...