Our pipeline has three parts: (i) initial data preparation, (ii) training and prediction, and (iii) model tuning. After initial data preparation (i), the samples are split semi-randomly (preserving sub-sample ratios) into two parts for training and prediction (ii): the training/validation set and the test set. We apply fSVA and PCA to the training/validation data and then train supervised SVM or random forest models on it. With the tuned model in hand, we make predictions on the test data, which has been batch corrected (via fSVA) and rotated (via PCA). This whole process is repeated 60 times to collect statistics on model performance. For model tuning (iii), the training/validation data set is similarly divided semi-rando...
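The repeated split-fit-predict loop described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: per-feature mean centering stands in for fSVA and PCA, a nearest-centroid classifier stands in for the SVM or random forest, and the tuning stage (iii) is omitted. The function names, the 25% test fraction, and the toy data are all hypothetical. The sketch is meant to show two points from the description: the split keeps class ratios, and transformation parameters are learned from the training/validation data only and then applied to the test data.

```python
import random
import statistics
from collections import defaultdict


def stratified_split(labels, test_fraction, rng):
    """Split indices so each class keeps roughly the same share in both parts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        n_test = max(1, round(len(members) * test_fraction))
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_centering(rows):
    """Learn per-feature means from training rows only (stand-in for fSVA + PCA)."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def apply_centering(rows, means):
    """Apply the previously learned means to any set of rows."""
    return [[x - m for x, m in zip(row, means)] for row in rows]


def fit_nearest_centroid(rows, labels):
    """Stand-in classifier: one centroid per class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(sorted(centroids), key=sq_dist)


def run_repeats(features, labels, n_repeats=60, test_fraction=0.25, seed=0):
    """Repeat split -> fit on train -> transform test -> predict; summarise accuracy."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_repeats):
        train_idx, test_idx = stratified_split(labels, test_fraction, rng)
        train_rows = [features[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        means = fit_centering(train_rows)  # fitted on training data only
        model = fit_nearest_centroid(apply_centering(train_rows, means), train_labels)
        test_rows = apply_centering([features[i] for i in test_idx], means)
        hits = sum(predict(model, row) == labels[i] for row, i in zip(test_rows, test_idx))
        accuracies.append(hits / len(test_idx))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Hypothetical toy data: two well-separated classes with a 12:8 ratio.
toy_features = [[0.1 * i, 1.0] for i in range(12)] + [[5.0 + 0.1 * i, -1.0] for i in range(8)]
toy_labels = ["a"] * 12 + ["b"] * 8
mean_accuracy, sd_accuracy = run_repeats(toy_features, toy_labels)
```

Fitting the centering on the training rows and reusing those parameters on the test rows mirrors the description's ordering, in which the test data is corrected and rotated using what was learned from the training/validation set.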
Automated machine learning (ML) pipeline composition and optimisation aim at automating the process ...
Abstract: Machine Learning generates programs that make predictions and informed decisions about com...
The original dataset was randomly split into optimization and experimental datasets. The forme...
A count matrix undergoes pre-processing, including normalization and filtering. The data is randomly...
The process of FS and classification consists of the following steps: 1) create 100 random splits of...
Machine learning algorithms are used to train the machine to learn on its own and improve from exper...
Master's thesis in Computer science. With the advent of the era of big data, machine learning has been...
This thesis explores one of the most fundamental questions in Machine Learning, namely, how should t...
We trained a classifier to predict phase III clinical trial outcomes, using 5-fold cross-validati...
The three extreme gradient boosting-based models were built through parameter optimization using the...
The data was split temporally into a training/validation dataset (2016) and testing dataset (2017). ...
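A temporal split of the kind mentioned above can be sketched as follows. This is a minimal illustration only: the record layout and the `date` field name are hypothetical, and only the two years come from the text. Every record dated in the earlier year goes to training/validation and every record dated in the later year goes to testing, so no future observation informs the model.

```python
from datetime import date


def temporal_split(records, train_year=2016, test_year=2017):
    """Assign records to train or test by calendar year of their 'date' field."""
    train = [r for r in records if r["date"].year == train_year]
    test = [r for r in records if r["date"].year == test_year]
    return train, test


# Hypothetical records, for illustration only.
example_records = [
    {"date": date(2016, 3, 1), "value": 1.0},
    {"date": date(2016, 11, 20), "value": 2.0},
    {"date": date(2017, 1, 5), "value": 3.0},
]
train_part, test_part = temporal_split(example_records)
```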
Much research has been conducted in the area of machine learning algorithms; however, the question o...
The tuning of learning algorithm parameters has become more and more important during the last years...
In the context of deep learning, the more expensive computational phase is the full training of the ...
Machine learning (ML) pipeline composition and optimisation have been studied to seek multi-stage ML...