Abstract Decision tree (and its extensions such as Gradient Boosting Decision Trees and Random Forest) is a widely used machine learning algorithm, due to its practical effectiveness and model interpretability. With the emergence of big data, there is an increasing need to parallelize the training process of decision tree. However, most existing attempts along this line suffer from high communication costs. In this paper, we propose a new algorithm, called Parallel Voting Decision Tree (PV-Tree), to tackle this challenge. After partitioning the training data onto a number of (e.g., M ) machines, this algorithm performs both local voting and global voting in each iteration. For local voting, the top-k attributes are selected from each machin...
Machine learning algorithms are now being deployed in practically all areas of our lives. Part of th...
Machine-learnt models based on additive ensembles of regression trees are currently deemed the best ...
Abstract—Decision tree construction is a well-studied data mining problem. In this paper, we focus o...
This paper presents a study that discusses how multi-threading can be used to improve the runtime pe...
Abstract. In the fields of data mining and machine learning the amount of data available for buildin...
Univariate decision tree algorithms are widely used in Data Mining because (i) they are easy to lear...
When running data-mining algorithms on big data platforms, a parallel, distributed framework, such a...
Abstract. Decision trees are one of the most effective and widely used induction methods that have r...
Abstract. One of the important and still not fully addressed issues in evolving decision trees is th...
Abstract. In most of data mining systems decision trees are induced in a top-down manner. This greed...
We present an algorithm designed to efficiently construct a decision tree over heterogeneously distr...
Data mining refers to the process of finding hidden patterns inside a large dataset. While improving...
Several algorithms have been proposed in the literature for building decision trees (DT) for large d...
Learning decision trees against very large amounts of data is not practical on single node computer...
State-of-the-art decision tree methods apply heuristics recursively to create each split in isolatio...
Machine learning algorithms are now being deployed in practically all areas of our lives. Part of th...
Machine-learnt models based on additive ensembles of regression trees are currently deemed the best ...
Abstract—Decision tree construction is a well-studied data mining problem. In this paper, we focus o...
This paper presents a study that discusses how multi-threading can be used to improve the runtime pe...
Abstract. In the fields of data mining and machine learning the amount of data available for buildin...
Univariate decision tree algorithms are widely used in Data Mining because (i) they are easy to lear...
When running data-mining algorithms on big data platforms, a parallel, distributed framework, such a...
Abstract. Decision trees are one of the most effective and widely used induction methods that have r...
Abstract. One of the important and still not fully addressed issues in evolving decision trees is th...
Abstract. In most of data mining systems decision trees are induced in a top-down manner. This greed...
We present an algorithm designed to efficiently construct a decision tree over heterogeneously distr...
Data mining refers to the process of finding hidden patterns inside a large dataset. While improving...
Several algorithms have been proposed in the literature for building decision trees (DT) for large d...
Learning decision trees against very large amounts of data is not practical on single node computer...
State-of-the-art decision tree methods apply heuristics recursively to create each split in isolatio...
Machine learning algorithms are now being deployed in practically all areas of our lives. Part of th...
Machine-learnt models based on additive ensembles of regression trees are currently deemed the best ...
Abstract—Decision tree construction is a well-studied data mining problem. In this paper, we focus o...