Due to the increased availability of large datasets of biological sequences, tools for sequence comparison now rely to a greater extent on efficient alignment-free approaches. Most alignment-free approaches require computing statistics when comparing sequences, even though such computations may not scale well in internal memory when very large collections of long sequences are considered. In this paper, we present a new conceptual data structure, the colored longest common prefix array (cLCP), that makes it possible to efficiently tackle several problems with an alignment-free approach. In fact, we show that such a data structure can be computed via sequential scans in semi-external memory. By using cLCP, we propose an efficient lightwe...
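The abstract above is truncated before it defines cLCP, so the following is only a rough illustration of the standard ingredients it builds on: the sorted suffixes of two string collections, each tagged with a "color" naming its collection, together with the ordinary longest common prefix (LCP) value between neighbouring suffixes. The function name `colored_suffix_lcp` and the color labels 0 and 1 are assumptions made for this sketch. It is a naive in-memory construction, not the sequential-scan, semi-external method the paper describes, and it is not the paper's cLCP definition.

```python
def colored_suffix_lcp(collection_a, collection_b):
    """Return (suffix, color, lcp) rows for all suffixes of two collections.

    color is 0 for suffixes from collection_a and 1 for collection_b
    (hypothetical labels). lcp is the length of the common prefix shared
    with the previous suffix in sorted order (0 for the first row).
    Naive construction: quadratic space, for illustration only.
    """
    suffixes = []
    for color, collection in ((0, collection_a), (1, collection_b)):
        for s in collection:
            for i in range(len(s)):
                suffixes.append((s[i:], color))

    # Lexicographic order of suffixes; ties are broken by color.
    suffixes.sort()

    rows = []
    prev = ""
    for suffix, color in suffixes:
        lcp = 0
        limit = min(len(prev), len(suffix))
        while lcp < limit and prev[lcp] == suffix[lcp]:
            lcp += 1
        rows.append((suffix, color, lcp))
        prev = suffix
    return rows


if __name__ == "__main__":
    for row in colored_suffix_lcp(["ab"], ["b"]):
        print(row)
```

In such a table, a large LCP value between two adjacent rows of different colors indicates a long substring shared by the two collections, which is the kind of information alignment-free statistics draw on.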
Sequencing technologies produce larger and larger collections of biosequences that have to be stored...
This paper outlines the design of a bit-parallel, multi-string algorithm for high-similarity string ...
Many sequence analysis tasks can be accomplished with a suffix array, and several of them ad...
The longest common prefix array is a very advantageous data structure that, combined with the suffix...
Motivation: Alignment-based methods for sequence analysis have various limitat...
When augmented with the longest common prefix (LCP) array and some other structures, the suffix arra...
The advent of "next-generation" DNA sequencing (NGS) technologies has meant that collections of hund...
Information in various applications is often expressed as character sequences over a finite alphabet...
The suffix array, a space efficient alternative to the suffix tree, is an important data structure f...
Finding the longest common subsequence (LCS) of multiple strings is an NP-hard problem, with many ...
Suffix arrays and the corresponding longest common prefix (LCP) array have wide applicatio...
The constrained longest common subsequence (CLCS) problem was introduced as a specific measure of si...
In this paper, we show new parallel algorithms for a set of classical string comparison problems. co...