This is a dissertation in three parts; in each, we explore the development and analysis of a parallel statistical or machine learning algorithm and its implementation. First, we examine the Assembly Likelihood Evaluation (ALE) framework. This algorithm defines a rigorous statistical likelihood metric used to validate and score genome and metagenome assemblies. It can be used to identify specific errors within assemblies and their locations; to enable comparison between assemblies, allowing for optimization of the assembly process; and, using re-sequencing data, to detect structural variations. Second, we develop an algorithm for Expected Parallel Improvement (EPI). This optimization method allows us to optimally sample many points concur...
The exponential growth of databases that contain biological information (such as protein and DNA da...
Thesis (Ph.D.), School of Electrical Engineering and Computer Science, Washington State University. Th...
In the current study we present a parallel statistical algorithm (SHMap), which distinguishes DNA re...
Recent advances in sequencing and synthesis technologies have sparked extraordinary growth in large-...
Finding gene locations for specific functions is an important topic in bioinformatics research that ...
A critical problem for computational genomics is de novo genome assembly: the develop...
In this paper, we will explore the need for parallelizing bioinformatics algorithms. More specificall...
With growing throughput and dropping cost of High-Throughput Sequencing (HTS) technologies, there is...
Bayesian optimization, a framework for global optimization of expensive-to-evaluate functions, has r...
As technology progresses, the processors used for statistical computation are not getting faster: th...
Genome sequence alignment problems are very important from the computational biology perspe...
One of the most ambitious trends in current biomedical research is the large-scale genomic sequencin...
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially...
This document surveys the computational strategies followed to parallelize the most used software in...
Background: The huge quantity of data produced in biomedical research needs sophisticated algorithmi...