Approximate pattern matching is an important computational problem with a wide range of applications in computational biology and information retrieval. However, searching for a short pattern in a text with high error rates (10–20%) under the Levenshtein distance is a task for which few efficient solutions exist. Here we address this problem by introducing a new type of seed: the 01⁎0 seeds. These seeds consist of two exact parts separated by parts that each contain exactly one error. We show that these seeds are lossless, and we apply them to two filtration algorithms for two popular applications: one where a compressed index is built on the text, and another where the patterns are indexed. We also demonstrate experimental...
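The lossless property of these seeds can be illustrated with a small brute-force check. The sketch below is not the authors' implementation. It assumes that the pattern is split into k + 2 parts when at most k errors are allowed (a detail the abstract does not state), and the function names `has_01star0` and `all_profiles_hit` are hypothetical. An "error profile" lists how many errors fall in each part; the check confirms that every profile with at most k errors contains two error-free parts separated only by parts with exactly one error.

```python
from itertools import product


def has_01star0(errors):
    """Return True if the error profile contains a 0 1* 0 run.

    `errors` is a sequence of per-part error counts. A run consists of two
    parts with 0 errors separated only by parts with exactly 1 error
    (possibly none).
    """
    open_run = False  # True while an error-free part has been seen with only 1s since
    for e in errors:
        if e == 0:
            if open_run:
                return True   # second exact part closes the run
            open_run = True   # first exact part opens a run
        elif e != 1:
            open_run = False  # a part with 2 or more errors breaks the run
    return False


def all_profiles_hit(k):
    """Check every profile over k + 2 parts (assumed partition) with at most k errors."""
    parts = k + 2
    return all(
        has_01star0(profile)
        for profile in product(range(k + 1), repeat=parts)
        if sum(profile) <= k
    )


if __name__ == "__main__":
    for k in range(5):
        print(k, all_profiles_hit(k))
```

Under the same assumption, using only k + 1 parts is not enough: with k = 2, the profile (0, 2, 0) has at most two errors but contains no such run, which is why the sketch uses k + 2 parts.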
We apply the concept of subset seeds proposed in [1] to similarity search in p...
Spaced seeds are a fundamental tool for similarity search in biosequences. The best sensitivity/sele...
Motivation: Analysis of genetic sequences is usually based on finding similar ...
We address the problem of approximate pattern matching using the Levenshtein d...
The original publication is available at www.springerlink.com. We study a method...
We study a method of seed-based lossless filtration for approximate string mat...
Speeding up approximate pattern matching has been a line of research in stringology since the 1980s....
Several algorithms for similarity search employ seeding techniques to quickly ...
Filtering is a standard technique for fast approximate string matching in practice. In filte...
Genomics studies routinely depend on similarity searches based on the strategy of finding sh...
Filtering is a standard technique for fast approximate string matching in practice. In filtering, a ...
Speeding up approximate pattern matching has been a line of research in stringology since the 1980s. ...
Most commonly used similarity search methods in genomic sequences are heuristic ones. These are base...