In this digital era, data sets are growing rapidly. Storing, processing, and analyzing large volumes of data require efficient techniques. These techniques address big data problems by providing time-efficient methods, effective external-memory algorithms, parallel and high-performance solutions, and so on. This thesis studies three important areas of big data problems and presents state-of-the-art approaches to address them. The first part of this thesis discusses the k-mer counting problem. A large number of bioinformatics applications require counting k-length substrings in genetically important long strings. Genome assembly, repeat detection, multiple sequence alignment, error detection, and many other related applications use a k-...
The rapid growth of data in bioinformatics and biomedical informatics brings new challenges to these...
With the advance of genomic research, the number of sequences involved in comparative methods has ...
Multiple longest common subsequence (MLCS) mining (a classical NP-hard problem) is an important tas...
Motivation: Building the histogram of occurrences of every k-symbol long substring of nucleotide dat...
Background: Distributed approaches based on the MapReduce programming paradigm have started to be prop...
Thesis (Ph.D.), Department of Electrical Engineering and Computer Science, Washington State Universi...
In this dissertation we offer novel algorithms for big data analytics. We live in a period when volu...
Motivation: A major challenge in next-generation genome sequencing (NGS) is to assemble massive ove...
Alignment-free algorithms are used in bioinformatics to efficiently evaluate the similarity between ...
Information in various applications is often expressed as character sequences over a finite alphabet...
In this dissertation, we worked on several algorithmic problems in bioinformatics using mainly three...
Advancements in biological research have enabled researchers to obtain large amounts of data, especi...