We consider the problem of indexing a text T (of length n) with a light data structure that supports efficient search of patterns P (of length m) allowing errors under the Hamming distance. We propose a hash-based strategy that employs two classes of hash functions, dubbed Hamming-aware and de Bruijn, to drastically reduce search space and memory footprint of the index, respectively. We use our succinct hash data structure to solve the k-mismatch search problem in $2n \log \sigma + o(n \log \sigma)$ bits of space with a randomized algorithm having smoothed complexity $O((2\sigma)^k (\log n)^k (\log m + \xi) + (occ + 1) \cdot m)$, where σ is the alphabet size, occ is the number of occurrences, and ξ is a term depending on m, n, and on the amplitude ε of the noise perturbing...
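To make the k-mismatch search problem concrete, the following is a minimal brute-force sketch in Python. It is not the hash-based index described in the abstract: it only states what a correct answer looks like, scanning every window of T in O(n · m) time with no index at all. The function names `hamming_distance` and `k_mismatch_search` are hypothetical, chosen here for illustration.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def k_mismatch_search(text: str, pattern: str, k: int) -> list[int]:
    """Return every 0-based position i such that text[i:i+m] is within
    Hamming distance k of pattern (m = len(pattern)).

    Brute-force baseline (an assumption for illustration, not the
    paper's method): every length-m window of the text is compared.
    """
    n, m = len(text), len(pattern)
    return [
        i
        for i in range(n - m + 1)
        if hamming_distance(text[i:i + m], pattern) <= k
    ]


if __name__ == "__main__":
    # With k = 1, only the window "GATT" at position 0 is close enough.
    print(k_mismatch_search("GATTACA", "GATA", 1))
```

An index such as the one proposed in the abstract aims to return the same set of positions without examining every window of the text.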
Approximate string matching is about finding a given string pattern in a text by allowing some degre...
In this paper we design two compressed data structures for the full-text indexing problem. These da...
There is growing interest in representing image data and feature descriptors using compact ...
This paper revisits the problem of indexing a text S[1..n] to support searching substrings in S that...
Let T be a text of length n and P be a pattern of length m, both strings over a fixed finite alphabe...
This paper addresses the problem of ultra-large-scale search in Hamming spaces. There has been consi...
This paper revisits the problem of indexing a text S[1..n] for pattern matching with up to k...
The high throughput of modern NGS sequencers coupled with the huge sizes of genomes currently analys...
There has been growing interest in mapping image data onto compact binary codes for fast near neighb...
Similarity preserving hashing can aid forensic investigations by providing means to recognize known ...
We revisit the problem of indexing a string S[1..n] to support finding all substrings in S that matc...
There is growing interest in representing image data and feature descriptors using compact binary co...
This paper revisits the problem of indexing a text for approximate string matching. Spec...
In this paper we study lower bounds for the fundamental problem of text indexi...