We propose a novel string search algorithm for data stored once and read many times. Our search method combines the sublinear traversal of the record (as in Boyer-Moore or Knuth-Morris-Pratt) with the agglomeration of parts of the record and search pattern into a single character – the algebraic signature – in the manner of Karp-Rabin. Our experiments show that, compared with an implementation of Boyer-Moore, our algorithm is up to seventy times faster for DNA data, up to eleven times faster for ASCII, and up to six times faster for XML documents. To obtain this speed-up, we store records in encoded form, where each original character is replaced with an algebraic signature. Our method applies to records stored in databases in general and t...
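As an illustration of the Karp-Rabin idea the abstract refers to, the sketch below condenses a window of the record and the search pattern into one number each and compares those numbers before comparing characters. It is plain Karp-Rabin with a polynomial hash modulo a prime, not the algebraic-signature scheme of the paper; the function name `karp_rabin_find` and the parameters `base` and `mod` are assumptions chosen for this example.

```python
def karp_rabin_find(text: str, pattern: str,
                    base: int = 256, mod: int = 1_000_003) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Illustrative Karp-Rabin search; `base` and `mod` are arbitrary
    choices for this sketch, not values taken from the paper.
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1

    # Weight of the leading character of a window: base^(m-1) mod mod.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = 0
    w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod

    for start in range(n - m + 1):
        # Equal hashes only suggest a match; confirm character by character.
        if w_hash == p_hash and text[start:start + m] == pattern:
            return start
        if start + m < n:
            # Slide the window: drop text[start], append text[start + m].
            w_hash = ((w_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return -1
```

For example, `karp_rabin_find("GATTACAGATTACA", "TACA")` locates the pattern by comparing one hash value per window and falls back to a direct comparison only when the hashes agree.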
Abstract—With the availability of large amounts of DNA data, exact matching of nucleotide sequences ...
Sometimes there is a need to store sensitive data on an untrusted database server. Song, Wagner and ...
Consider the problem of finding the first occurrence of a particular pattern in a (long) string of c...
We propose a novel string (pattern) matching algorithm called n-gram search. We intend it for the re...
Abstract We present the AS-Index, a new index structure for exact string search in disk resident dat...
Scalable Distributed Data Structures (SDDS) are a class of data structures for multicomputers (a dis...
A new string searching algorithm is presented aimed at searching for the occurrence of character pat...
AS-Index is a new index structure for exact string search in disk resident databases. It uses hashin...
The proliferation of online text, such as on the World Wide Web and in databases, motivates the need...
This paper describes a new method of indexing and searching large binary signature collections to e...
Abstract. Motivated by the imminent growth of massive, highly redundant genomic databases we study ...
Introduction. The rapid development of systems such as Yandex, Google, etc., has predetermined the...
Sometimes there is a need to store sensitive data on an untrusted database server. Song, Wagner and P...