Background: We propose a sequence clustering algorithm and compare the partition quality and execution time of the proposed algorithm with those of a popular existing algorithm. The proposed clustering algorithm uses a grammar-based distance metric to determine partitioning for a set of biological sequences. The algorithm performs clustering in which new sequences are compared with cluster-representative sequences to determine membership. If comparison fails to identify a suitable cluster, a new cluster is created. Results: The performance of the proposed algorithm is validated via comparison to the popular DNA/RNA sequence clustering approach, CD-HIT-EST, and to the recently developed algorithm, UCLUST, using two different sets of 16S rDNA...
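The greedy procedure described in this abstract (compare each new sequence with the cluster representatives, join the closest cluster if it is close enough, otherwise start a new cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the grammar-based distance, so a simple k-mer Jaccard distance is used here as a stand-in, and the names `greedy_cluster`, `kmer_jaccard_distance`, and the `threshold` parameter are hypothetical.

```python
from typing import Callable, List


def kmer_set(seq: str, k: int = 4) -> set:
    """Return the set of overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_jaccard_distance(a: str, b: str, k: int = 4) -> float:
    """Stand-in distance (NOT the grammar-based metric from the abstract):
    1 minus the Jaccard similarity of the two k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka and not kb:
        return 0.0
    return 1.0 - len(ka & kb) / len(ka | kb)


def greedy_cluster(
    sequences: List[str],
    distance: Callable[[str, str], float] = kmer_jaccard_distance,
    threshold: float = 0.5,  # assumed cutoff; the abstract gives no value
) -> List[List[str]]:
    """Assign each sequence to the nearest cluster representative,
    or open a new cluster when no representative is within threshold."""
    representatives: List[str] = []
    clusters: List[List[str]] = []
    for seq in sequences:
        best_idx, best_dist = None, float("inf")
        for idx, rep in enumerate(representatives):
            d = distance(seq, rep)
            if d < best_dist:
                best_idx, best_dist = idx, d
        if best_idx is not None and best_dist <= threshold:
            clusters[best_idx].append(seq)
        else:
            # Comparison failed to find a suitable cluster: the sequence
            # becomes the representative of a new cluster.
            representatives.append(seq)
            clusters.append([seq])
    return clusters
```

In this sketch the first member of each cluster serves as its representative, so the result depends on input order. Any other distance function with the same two-argument signature, including a grammar-based one, could be passed in place of the stand-in.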
Background: The sequencing of the human genome has enabled us to access a comprehensive list of genes...
Cluster analysis or clustering is an important data mining technique widely used for pattern recogni...
We present a fast algorithm for sequence clustering and searching which works with large sequence da...
Motivation: Similarity clustering of next-generation sequences (NGS) is an important computational pro...
Comparing a string to a large set of sequences is a key subroutine in greedy heuristics for clusteri...
The article describes two new clustering algorithms for DNA nucleotide sequences, summarizes the res...
Philosophiae Doctor - PhD. Summary: Expressed sequence tag database is a rich and fast growing source ...
Background: Clustering is a fundamental operation in the analysis of biological sequence data. New D...
Motivation: Nucleotide sequence data are being produced at an ever increasing rate. Clustering such ...
Background: Many problems in computational biology require alignment-free sequence comparisons. One ...
To analyze complex biodiversity in microbial communities, 16S rRNA marker gene sequences are often a...
This paper explores clustering algorithms to construct a phylogenetic tree, based on distance measur...
Philosophiae Doctor - PhD. Expressed sequence tag database is a rich and fast growing source of data f...