The amount of available data has allowed the field of machine learning to flourish. But with growing data set sizes comes an increase in algorithm execution times. Cluster computing frameworks provide tools for distributing data and processing power on several computer nodes and allows for algorithms to run in feasible time frames when data sets are large. Different cluster computing frameworks come with different trade-offs. In this thesis, the scalability of the execution time of machine learning algorithms running on the Hadoop cluster computing framework is investigated. A recent version of Hadoop and algorithms relevant in industry machine learning, namely K-means, latent Dirichlet allocation and naive Bayes are used in the experiments...
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing...
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing...
We propose a new ensemble algorithm: the meta-boosting algorithm. This algorithm enables the origina...
The amount of available data has allowed the field of machine learning to flourish. But with growing...
This bachelor’s thesis compares several tools for building a scalable, machine learning platform and...
Machine Learning (ML) is at the core of data analysis. Machine Learning Algorithms (MLA) are sequent...
There is a huge and rapidly increasing amount of data being generated by social media, mobile applic...
There is a huge and rapidly increasing amount of data being generated by social media, mobile applic...
Abstract—Big data processing is currently becoming in-creasingly important in modern era due to the ...
: Machine learning using large data sets is a computationally intensive process. One technique that ...
Machine learning (ML) is a cornerstone of the new data revolution. Most attempts to scale machine le...
This paper compares parallel and distributed implementations of an iterative, Gibbs sampling, machin...
The course offers basics of analyzing data with machine learning and data mining algorithms in order...
We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to g...
We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to g...
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing...
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing...
We propose a new ensemble algorithm: the meta-boosting algorithm. This algorithm enables the origina...
The amount of available data has allowed the field of machine learning to flourish. But with growing...
This bachelor’s thesis compares several tools for building a scalable, machine learning platform and...
Machine Learning (ML) is at the core of data analysis. Machine Learning Algorithms (MLA) are sequent...
There is a huge and rapidly increasing amount of data being generated by social media, mobile applic...
There is a huge and rapidly increasing amount of data being generated by social media, mobile applic...
Abstract—Big data processing is currently becoming in-creasingly important in modern era due to the ...
: Machine learning using large data sets is a computationally intensive process. One technique that ...
Machine learning (ML) is a cornerstone of the new data revolution. Most attempts to scale machine le...
This paper compares parallel and distributed implementations of an iterative, Gibbs sampling, machin...
The course offers basics of analyzing data with machine learning and data mining algorithms in order...
We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to g...
We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to g...
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing...
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing...
We propose a new ensemble algorithm: the meta-boosting algorithm. This algorithm enables the origina...