This thesis investigates the impact of sequencing errors on post-sequencing computational analyses, including local alignment search and multiple sequence alignment. Although the error rates of sequencing technologies are commonly reported, their significance cannot be fully grasped without putting them in the perspective of their impact on the downstream analyses used for biological research, forensics, disease diagnosis, etc. I quantified this impact using fault injection: faults were injected into the input sequence data, the analyses were run, and any change in their output was interpreted as the impact of the faults, or errors. Three commonly used algorithms were studied: BLAST, SSEARCH, and ...
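The fault-injection idea described in the abstract can be illustrated with a minimal sketch. It assumes a simple substitution-only fault model over the DNA alphabet with a uniform per-base error rate. The thesis may well have used other fault models, and the names `inject_substitutions`, `hamming_distance`, and `output_changed` are hypothetical, introduced here only for illustration. The `analysis` argument stands in for whatever tool is being studied; no real BLAST or SSEARCH call is made.

```python
import random

BASES = "ACGT"  # assumed alphabet; ambiguity codes such as N are left untouched


def inject_substitutions(seq, error_rate, rng):
    """Return a copy of seq in which each A/C/G/T base is replaced, with
    probability error_rate, by a different base chosen uniformly at random.

    This is an assumed fault model (substitutions only), not necessarily
    the one used in the thesis.
    """
    out = []
    for base in seq:
        if base in BASES and rng.random() < error_rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def output_changed(analysis, clean_seq, faulty_seq):
    """Report whether a (hypothetical) analysis callable gives a different
    result on the faulty input than on the clean input."""
    return analysis(clean_seq) != analysis(faulty_seq)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the experiment is repeatable
    clean = "ACGTACGTACGTACGTACGT"
    faulty = inject_substitutions(clean, 0.1, rng)
    print("injected faults:", hamming_distance(clean, faulty))
    # Toy stand-in for a downstream analysis: does the read contain a motif?
    print("output changed:", output_changed(lambda s: "GTAC" in s, clean, faulty))
```

Seeding the random generator keeps each injection run reproducible, so a change in the analysis output can be traced back to a specific set of injected faults.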
2012-09-01 The development of second-generation sequencing (SGS) technology has provided scientists...
BACKGROUND: Multiple sequence alignment (MSA) is an extremely useful tool for molecular and evolutio...
The recent release of twenty-two new genome sequences has dramatically increased the data available ...
Rapid advances in high-throughput sequencing (HTS) technologies have led to an exponential increase ...
Motivation: Bioinformatics tools, such as assemblers and aligners, are expected to produce more accu...
This thesis investigates the accuracy bounds imposed on alignment-based variant calling workflows du...
Background: A feature common to all DNA sequencing technologies is the presence of base-call errors ...
In the biological sciences, sequence analysis refers to analytical investigations that use nucleic a...
[EN] The sequencing market has increased steadily over the last few years, with different approaches...
Tremendous evolvement in sequencing technologies and the vast availability of data due to decreasing...
Error Correction is important for most next-generation sequencing applications because highly accura...
The simple world of algorithms can be applied to various problems all around us. With significant gr...
The Basic Local Alignment Search Tool (BLAST) algorithm is one of the most commonly used algorit...
SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data. There is ...