Implementing machine learning algorithms in a distributed environment offers several advantages, such as the ability to process large datasets and linear speedup with additional processing units. We describe the MapReduce paradigm, which enables distributed computing, and the Disco framework, which implements it. We present the summation form, a condition for implementing algorithms efficiently with the MapReduce paradigm, and describe the implementations of the selected algorithms. We propose novel distributed random forest algorithms that build models on subsets of the dataset. We compare the running time and accuracy of the algorithms with well-recognized data analytics tools. We end our master's thesis by describing the integration of t...
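To make the summation form mentioned above concrete, the following is a minimal sketch in plain Python, not the thesis implementation, and it does not use the Disco API. It assumes a simple one-variable least-squares fit as the example algorithm: each partition computes partial sums independently (the map step), and the partial sums are added together (the reduce step). The function names `map_partition`, `combine`, and `fit_line`, as well as the sample data, are hypothetical and chosen only for illustration.

```python
from functools import reduce


def map_partition(rows):
    """Map step: compute partial sums over one partition of (x, y) pairs."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in rows:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Reduce step: element-wise addition of two tuples of partial sums."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(partitions):
    """Fit y = slope * x + intercept from the summed statistics."""
    n, sx, sy, sxx, sxy = reduce(combine, map(map_partition, partitions))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical data lying on y = 2x + 1, split into two partitions.
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
]
slope, intercept = fit_line(partitions)
```

Because the model depends on the data only through sums, the result does not change with how the rows are split across partitions, which is the property that lets the map step run on separate workers.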
Although the support vector machine (SVM) algorithm has a high generalization property for classifyi...
Implementing machine learning algorithms for large data, such as the Web graph and social networks, ...
In the big data era, the need for fast robust machine learning techniques is rapidly increas...
The advent of algorithms capable of leveraging vast quantities of data and computational resources h...
As the amount of data generated on a day-to-day basis keeps rising, the urgency for effic...
In this paper, we discuss a Grid data mining system based on the MapReduce paradigm of comp...
The rise of big data has led to new demands for machine learning (ML) systems to learn complex model...
The problem of devising models and algorithms for high-performance Distributed Data Mining has tradi...
With the emergence of big data, inducing regression trees on very large data sets became a common d...
We propose a new ensemble algorithm: the meta-boosting algorithm. This algorithm enables the origina...
Despite technological advances making computing devices faster, smaller, and more prevalent ...
The demand for artificial intelligence has grown significantly over the past decade, and this growth...