We propose a novel string (pattern) matching algorithm called n-gram search. It is intended for records that are stored once and searched many times in a database or a file, especially one organized as a Scalable Distributed Data Structure (SDDS) over a grid or a structured P2P network. We presume that the records are encoded into their cumulative algebraic signatures, which provides incidental confidentiality of the stored data. The search starts by pre-processing the pattern: it calculates the logarithmic algebraic signature (LAS) of the pattern and the LAS of every n-gram in it. The value of n ≥ 1 is a tunable parameter. The search then attempts to match the LASs of the n-grams in the pattern against dynamically calculated LASs, sampled over n-grams in th...
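As a rough illustration of the idea, the following Python sketch encodes a record into cumulative algebraic signatures, derives the LAS of any substring from two stored values, and samples n-grams in the record to find candidate pattern positions. It is a sketch under stated assumptions, not the authors' implementation: the field GF(2^8) with primitive polynomial 0x11D and α = 2, one-symbol signatures, the fixed sampling stride of m − n + 1 (m being the pattern length), the symbol-by-symbol verification step, and all helper names are hypothetical choices made for this example.

```python
# Illustrative sketch only. GF(2^8), the polynomial 0x11D, alpha = 2,
# the sampling stride, and every function name here are assumptions.
PRIM_POLY = 0x11D
ORDER = 255  # size of the multiplicative group of GF(2^8)

# Log/antilog tables for alpha = 2.
EXP = [0] * ORDER
LOG = [None] * 256  # LOG[0] stays None: zero has no logarithm
x = 1
for i in range(ORDER):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY


def mul_alpha_pow(sym, k):
    """Return sym * alpha**k in GF(2^8)."""
    return 0 if sym == 0 else EXP[(LOG[sym] + k) % ORDER]


def encode_cas(data):
    """cas[0] = 0 and cas[i] = sum over k = 1..i of data[k-1] * alpha**k."""
    cas = [0]
    for k, sym in enumerate(data, start=1):
        cas.append(cas[-1] ^ mul_alpha_pow(sym, k))
    return cas


def las_from_cas(cas, start, length):
    """LAS of data[start:start+length], or None if its signature is zero."""
    diff = cas[start + length] ^ cas[start]  # alpha**start * signature
    return None if diff == 0 else (LOG[diff] - start) % ORDER


def decode_symbol(cas, k):
    """Recover data[k] (0-based) from two neighbouring CAS values."""
    diff = cas[k + 1] ^ cas[k]  # data[k] * alpha**(k+1)
    return 0 if diff == 0 else EXP[(LOG[diff] - (k + 1)) % ORDER]


def ngram_search(cas, pattern, n=2):
    """Return all start offsets of pattern in the CAS-encoded record."""
    m, rec_len = len(pattern), len(cas) - 1
    if not 1 <= n <= m:
        raise ValueError("n must satisfy 1 <= n <= len(pattern)")
    # Pre-processing: LAS of the pattern and of each of its n-grams.
    pcas = encode_cas(pattern)
    pattern_las = las_from_cas(pcas, 0, m)
    table = {}
    for j in range(m - n + 1):
        table.setdefault(las_from_cas(pcas, j, n), []).append(j)
    # Every occurrence fully contains exactly one n-gram sampled at this stride.
    stride = m - n + 1
    hits = set()
    for s in range(0, rec_len - n + 1, stride):
        for j in table.get(las_from_cas(cas, s, n), ()):
            t = s - j  # candidate start of the pattern
            if t < 0 or t + m > rec_len:
                continue
            if las_from_cas(cas, t, m) != pattern_las:
                continue  # whole-pattern signatures differ
            # One-symbol signatures can collide, so confirm symbol by symbol.
            if all(decode_symbol(cas, t + i) == pattern[i] for i in range(m)):
                hits.add(t)
    return sorted(hits)


record = b"abracadabra abracadabra"
cas = encode_cas(record)
print(ngram_search(cas, b"cadab", n=2))  # [4, 16]
```

Larger n makes each sampled n-gram more selective but shortens the stride m − n + 1, which is one way to read the remark that n can be tuned.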
We present a new bit-parallel technique for approximate string matching. We build on two previous te...
One of the initial hurdles in taking advantage of big data is the ability to quickly analyze and est...
Motivated by the imminent growth of massive, highly redundant genomic databases we study ...
We propose a novel string search algorithm for data stored once and read many times. Our search meth...
We propose a novel string search algorithm for data stored once and read many times. Our search met...
We present the AS-Index, a new index structure for exact string search in disk resident dat...
Scalable Distributed Data Structures (SDDS) are a class of data structures for multicomputers (a dis...
Compressed full-text indexes have been one of pattern matching’s most important success st...
This paper describes a new method of indexing and searching large binary signature collections to e...
Pattern matching consists of finding occurrences of a pattern in some data. One general appr...
This article presents a new, memory efficient and cache-optimized algorithm for simultaneously searc...
The standard string matching problem involves finding all occurrences of a single pattern in a singl...