Abstract. Searching for sequence similarity in large-scale databases of DNA and protein sequences is one of the essential problems in bioinformatics. To distinguish random matches from biologically relevant similarities, it is customary to compute the statistical P-value of each discovered match. In this context, the P-value is the probability that a similarity with a given score or higher would appear by chance in a comparison of a random query and a random database. Note that the P-value is a function of the database size, since a high-scoring similarity is more likely to exist by chance in a larger database. Biological databases often contain redundant, identical, or very similar sequences. This fact is not taken into account in P-value estimation, r...
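The dependence of the P-value on database size can be illustrated with a small sketch. It assumes the widely used Karlin-Altschul form, E = K·m·n·exp(-λS) and P = 1 - exp(-E), which the abstract above does not name; the function names and the default values of `lam` and `k` are illustrative placeholders, not parameters taken from the paper.

```python
import math


def e_value(score, query_len, db_len, lam=0.267, k=0.041):
    """Expected number of chance matches scoring >= score.

    Assumes the Karlin-Altschul form E = K * m * n * exp(-lambda * S).
    lam and k are placeholder values chosen only for illustration.
    """
    return k * query_len * db_len * math.exp(-lam * score)


def p_value(score, query_len, db_len, lam=0.267, k=0.041):
    """Probability of at least one chance match scoring >= score: 1 - exp(-E)."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate when E is tiny.
    return -math.expm1(-e_value(score, query_len, db_len, lam, k))


if __name__ == "__main__":
    # Same score and query length, growing database: the P-value rises.
    for db_len in (10**4, 10**6, 10**8):
        print(db_len, p_value(60, 300, db_len))
```

Under these assumptions, a fixed score becomes less significant as the database grows, which is why counting redundant sequences at full weight can make a match look less significant than it is.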
This paper describes a method to compress molecular biology databases that are characterized by an i...
Motivation: Comparison of nucleic acid and protein sequences is a fundamental tool of modern bioinfo...
Using DNA to store digital signals, or data in general, offers significant advantages when compared ...
Rapid advancements in research in the field of DNA sequence discovery have led to a vast range of com...
Motivation: Database search programs such as FASTA, BLAST or a rigorous Smith-Waterman algorithm pr...
Matching a biological sequence against a probabilistic pattern (or profile) is a common task in comp...
Biological data mainly comprises deoxyribonucleic acid (DNA) and protein sequences. These are the ...
The increasing volume of biological data requires finding new ways to save these data in genetic ban...
Motivation: Several measures of DNA sequence dissimilarity have been developed. The purpose of this pape...
Motivation: Matching a biological sequence against a probabilistic pattern (or profile) is a common...
The increase in memory and in network traffic used and caused by new sequenced biological data has r...
Data storage costs account for an appreciable proportion of the total cost in the creation and analysis of DNA ...
Abstract. The generation of new databases and the amount of data in existing ones are growing exponent...
Abstract. The growing volume of generated DNA sequencing data makes the problem of its ...
Traditionally, intra-sequence similarity is exploited for compressing a single DNA sequence. Recentl...