Motivation: Efficient clustering is important for handling the large amount of available EST sequences. Most contemporary methods are based on some kind of all-against-all comparison, resulting in a quadratic time complexity. A different approach is needed to keep up with the rapid growth of EST data. Results: A new, fast EST clustering algorithm is presented. Sub-quadratic time complexity is achieved by using an algorithm based on suffix arrays. A prototype implementation has been developed and run on a benchmark data set. The produced clusterings are validated by comparing them to clusterings produced by other methods, and the results are quite promising. Availability: The source code for the prototype implementation is available unde...
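As a rough illustration of why a suffix array removes the need for all-against-all comparison, the sketch below joins the sequences into one text, sorts its suffixes, and merges any two sequences whose suffixes are adjacent in sorted order and share a prefix of at least `min_len` characters. This is not the authors' algorithm. The function names, the `$` separator, the `min_len` threshold and the union-find merging are all assumptions made for the example, and the suffix array is built by naive sorting, which is not itself sub-quadratic. Reverse-complement matches, which matter for real EST data, are ignored.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically.

    Fine for a toy example; a real implementation would use a dedicated
    suffix array construction algorithm instead.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def cluster_sequences(seqs, min_len, sep="$"):
    """Group sequences that share an exact substring of length >= min_len.

    Hypothetical helper for illustration. `sep` must not occur in any sequence.
    """
    text = sep.join(seqs) + sep

    # owner[i] = index of the sequence that position i of `text` belongs to
    owner = []
    for idx, s in enumerate(seqs):
        owner.extend([idx] * (len(s) + 1))  # +1 covers the trailing separator

    parent = list(range(len(seqs)))  # union-find forest over sequence indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    sa = build_suffix_array(text)

    # All suffixes starting with the same min_len-mer form one contiguous
    # block of the suffix array, so checking neighbours is enough.
    for a, b in zip(sa, sa[1:]):
        prefix = text[a:a + min_len]
        if (len(prefix) == min_len and sep not in prefix
                and text[b:b + min_len] == prefix):
            ra, rb = find(owner[a]), find(owner[b])
            if ra != rb:
                parent[rb] = ra

    clusters = {}
    for idx in range(len(seqs)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


if __name__ == "__main__":
    reads = ["ACGTACGTTT", "GGACGTACAA", "TTTTCCCCGG", "CCCCGGAATT"]
    print(cluster_sequences(reads, min_len=6))  # [[0, 1], [2, 3]]
```

Only neighbouring suffixes are compared, so the number of comparisons is linear in the total text length once the suffix array exists. That is the general property a suffix-array approach can exploit.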
The rapid development of sequencing technology has led to an explosive accumulation of genomic seque...
Clustering is a group of (unsupervised) machine learning algorithms used to categorize data into clu...
Philosophiae Doctor - PhD. Summary: Expressed sequence tag database is a rich and fast growing source ...
Philosophiae Doctor - PhD. Expressed sequence tag database is a rich and fast growing source of data f...
We present a fast algorithm for sequence clustering and searching which works with large sequence da...
EST clustering is a simple, yet effective method to discover all the genes present in a variety of s...
In recent years, we have seen an enormous growth in the amount of available commercial and scientifi...
Expressed sequence tags, abbreviated ESTs, are DNA molecules experimentally derived from expressed p...
Abstract: The suffix array is a data structure that finds numerous applications in string processing p...
Our work involves developing an intelligent, time- and memory-efficient parallel clustering algorith...
Background: We propose a sequence clustering algorithm and compare the partition quality and executi...
Motivation: Second-generation sequencing technology has reinvigorated research using expression data...
Real-time sequence clustering is the problem of clustering an infinite stream of sequences in real t...
Background: Expressed sequence tags (ESTs) are single pass reads from randomly selected cDNA clones. ...