In this cumulative dissertation, I examine the influence of hyperparameters on machine learning algorithms, with a special focus on random forest. The thesis consists mainly of three papers written over the last three years. The first paper (Probst and Boulesteix, 2018) examines the influence of the number of trees on the performance of a random forest. It is generally believed that a larger number of trees yields better performance. However, we show real data examples in which the expected value of measures such as accuracy and AUC (partially) decreases as the number of trees grows. We prove theoretically why this can happen and argue that it occurs only in very special data situations. For other measure...
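The effect of the number of trees on expected accuracy can be illustrated with a small sketch. It is not the proof from the paper; it rests on a simplifying assumption introduced here: for a given observation, every tree votes for the correct class independently with a fixed probability p, and the forest predicts by majority vote. The function names and the two toy datasets are hypothetical and serve only as illustration. Under this assumption, observations with p above 0.5 benefit from more trees, while observations with p below 0.5 are classified correctly less and less often, so a dataset dominated by the latter can show decreasing expected accuracy. AUC is not covered by this sketch.

```python
from math import comb


def prob_majority_correct(p: float, n_trees: int) -> float:
    """Probability that more than half of n_trees independent votes are correct.

    Assumes each tree votes correctly with probability p (a simplification).
    Use an odd n_trees so that ties cannot occur.
    """
    k_min = n_trees // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_trees, k) * p**k * (1 - p) ** (n_trees - k)
        for k in range(k_min, n_trees + 1)
    )


def expected_accuracy(probs: list[float], n_trees: int) -> float:
    """Mean over observations of the probability of a correct majority vote."""
    return sum(prob_majority_correct(p, n_trees) for p in probs) / len(probs)


# Hypothetical per-observation probabilities of a correct single-tree vote.
special = [0.45] * 8 + [0.9] * 2  # most observations slightly below 0.5
typical = [0.6] * 10              # all observations above 0.5

for n_trees in (1, 11, 101):
    print(
        n_trees,
        round(expected_accuracy(special, n_trees), 3),
        round(expected_accuracy(typical, n_trees), 3),
    )
```

In the hypothetical "special" dataset the expected accuracy falls as trees are added, whereas in the "typical" dataset it rises, which matches the qualitative statement that a decrease occurs only in particular data situations.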
The ensemble method random forests has become a popular classification tool in bioinformatics and re...
Random Forests (RF) of tree classifiers are a popular ensemble method for classification. RF have sh...
Random forests are a very effective and commonly used statistical method, but their full theoretical...
In this paper we present our work on the Random Forest (RF) family of classification methods. Our go...
In order to create a machine learning model, one is often tasked with selecting certain hyperparamet...
Breiman's (2001) random forests are a very popular class of learning algorithms often able to produc...
Breiman (2001a,b) has recently developed an ensemble classification and regression approach that dis...
In this paper, we present a non-deterministic strategy for searching for optim...
Hyperparameters in machine learning (ML) have received a fair amount of attention, and hyperparamete...
The performance of many machine learning methods depends critically on hyperparameter settings. So...
Machine-learning algorithms have gained popularity in recent years in the field of ecological modeli...
Random forests are a learning algorithm proposed by Breiman [Mach. Learn. 45 (...
Hyperparameter tuning: Random Forest accuracy scores for multiple numbers of trees on the US-FD1W-25...
Recent studies have expanded the focus of machine learning methods like random forests beyond predic...