In this paper, a novel technique for parallelizing data-classification problems is applied to finding genes in DNA sequences. The technique combines ensemble classification methods such as Bagging and Select Best, and distributes classifier training and prediction using MapReduce. A novel sequence classification voting algorithm is evaluated within the Bagging method and compared against the Select Best method.
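The general shape of such a pipeline can be illustrated with a minimal sketch. It is not the paper's implementation: the base learner `train_stub`, the function names, and the toy data are all hypothetical, plain per-position majority voting stands in for the paper's novel voting algorithm (which the abstract does not describe), and Python's built-in `map` and `functools.reduce` stand in for a real MapReduce framework.

```python
import random
from collections import Counter
from functools import reduce


def train_stub(sample):
    """Hypothetical base learner: label each nucleotide with the label it
    most often carried in the training sample."""
    per_base = {}
    for base, label in sample:
        per_base.setdefault(base, Counter())[label] += 1
    table = {base: c.most_common(1)[0][0] for base, c in per_base.items()}
    # Bases missing from the sample fall back to the overall majority label.
    fallback = Counter(label for _, label in sample).most_common(1)[0][0]
    return lambda seq: [table.get(base, fallback) for base in seq]


def map_train(task):
    """Map step (training): draw a bootstrap sample and fit one classifier."""
    data, seed = task
    rng = random.Random(seed)
    bootstrap = [rng.choice(data) for _ in data]  # sample with replacement
    return train_stub(bootstrap)


def map_predict(model, seq):
    """Map step (prediction): emit one single-vote ballot per position."""
    return [Counter([label]) for label in model(seq)]


def reduce_votes(left, right):
    """Reduce step: add up the ballots position by position."""
    return [a + b for a, b in zip(left, right)]


def bagging_predict(models, seq):
    """Bagging: every classifier votes; the majority label wins per position."""
    ballots = (map_predict(model, seq) for model in models)
    totals = reduce(reduce_votes, ballots)
    return [votes.most_common(1)[0][0] for votes in totals]


def select_best(models, validation):
    """Select Best: keep only the classifier with the highest accuracy
    on a labelled validation set."""
    bases = [base for base, _ in validation]
    truth = [label for _, label in validation]

    def accuracy(model):
        predicted = model(bases)
        return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

    return max(models, key=accuracy)


if __name__ == "__main__":
    # Toy labelled data, invented purely for illustration.
    data = [("A", "gene"), ("C", "gene"), ("G", "other"), ("T", "other")] * 5
    models = list(map(map_train, [(data, seed) for seed in range(5)]))
    print(bagging_predict(models, "ACGT"))
    print(select_best(models, data)("ACGT"))
```

Because each call to `map_train` depends only on its own `(data, seed)` pair, the training tasks are independent of one another, which is the property that lets a MapReduce framework spread them across workers. The vote-summing reduce step is associative, so partial tallies could likewise be combined in any order.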
© 2016 Anaissi et al. This is an open access article distributed under the terms of the Creative Com...
HPC (high performance computing) based on clusters of multicores is one of the main research lines in...
We present research on the design, development and application of algorithms for DNA sequence analys...
This is a dissertation in three parts, in each we explore the development and analysis of a parallel...
The gene microarray analysis and classification have demonstrated an effective way for the effective...
The scope and scale of biological data continue to grow at an exponential clip, driven by advances ...
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the ...
A DNA sequence analysis parallelization in large databases using cluster, multi-cluster, and GRID is...
As cost and throughput of second-generation sequencers continue to improve, even modestly resourced ...
Unsupervised ensemble learning refers to methods devised for a particular task that combine data pro...
Peer reviewed. This paper presents a new algorithm based on the segment and combine paradigm, for auto...
Ensemble learning is an intensively studied technique in machine learning and pattern recognition. R...
Machine learning is a data processing technology that uses training data to help make judgm...
Gene expression profiling has emerged as an efficient technique for classification, diagnosis and tr...