In practical applications of supervised statistical learning, the separation of training and test data is often violated because one or several analysis steps are performed on the whole data set before the prediction error is estimated by cross-validation (CV). We refer to such practices as incomplete CV. For the special case of preliminary variable selection in high-dimensional microarray data, the corresponding error estimate is well known to be strongly downwardly biased, resulting in over-optimistic conclusions regarding the prediction accuracy of the fitted models. However, while other data preparation steps may also be affected by these problems, their impact on error estimation is far less acknowledged in the literature. In this paper we sh...
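The downward bias from preliminary variable selection described above can be illustrated with a small simulation. The following is a minimal sketch and not the procedure of any of the papers listed here: it assumes pure-noise Gaussian features, balanced labels that carry no signal, a simple nearest-centroid classifier, and selection by absolute class-mean difference. All function names (`class_means`, `select_features`, `cv_error`, `simulate`) and the default sizes are hypothetical choices made for illustration.

```python
import random


def class_means(X, y, rows, cols):
    """Per-class feature means over the given rows, restricted to cols."""
    means = {}
    for label in (0, 1):
        members = [i for i in rows if y[i] == label]
        means[label] = [sum(X[i][j] for i in members) / len(members) for j in cols]
    return means


def select_features(X, y, rows, k):
    """Indices of the k features with the largest absolute class-mean difference."""
    cols = range(len(X[0]))
    m = class_means(X, y, rows, cols)
    scores = [abs(m[1][j] - m[0][j]) for j in cols]
    return sorted(cols, key=lambda j: scores[j], reverse=True)[:k]


def cv_error(X, y, folds, k, select_inside):
    """Cross-validated misclassification rate of a nearest-centroid rule.

    select_inside=False: features are chosen once on ALL samples (incomplete CV).
    select_inside=True:  features are re-chosen on each training fold (full CV).
    """
    n = len(X)
    all_rows = list(range(n))
    global_cols = None if select_inside else select_features(X, y, all_rows, k)
    errors = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in all_rows if i not in held_out]
        cols = select_features(X, y, train, k) if select_inside else global_cols
        m = class_means(X, y, train, cols)
        for i in test:
            # Squared Euclidean distance to each class centroid.
            dist = {
                c: sum((X[i][j] - m[c][pos]) ** 2 for pos, j in enumerate(cols))
                for c in (0, 1)
            }
            pred = 0 if dist[0] <= dist[1] else 1
            errors += pred != y[i]
    return errors / n


def simulate(seed=0, n=40, p=2000, k=10, n_folds=5):
    """Return (incomplete CV error, full CV error) on data without any signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [i % 2 for i in range(n)]  # balanced labels, independent of X
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    incomplete = cv_error(X, y, folds, k, select_inside=False)
    full = cv_error(X, y, folds, k, select_inside=True)
    return incomplete, full


if __name__ == "__main__":
    inc, full = simulate()
    print(f"incomplete CV error: {inc:.2f}")
    print(f"full CV error:       {full:.2f}")
```

Because the labels are unrelated to the features, the true error of any classifier here is 50%. Under these assumptions the full CV estimate should stay near that level, while the incomplete CV estimate is typically far lower, since the held-out samples already influenced which features were kept.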
Given the relatively small number of microarrays typically used in gene-expression-based classificat...
We consider the mean prediction error of a classification or regression procedure as well as its cro...
High-dimensional binary classification tasks, e.g. the classification of microarray samples into nor...
Background: In applications of supervised statistical learning in the biomedical field it is necessa...
Motivation: Numerous competing algorithms for prediction in high-dimensional settings have been deve...
Background: Cross-validation (CV) is an effective method for estimating the prediction error...
Background: To estimate a classifier’s error in predicting future observations, bootstrap me...
Machine learning is largely an experimental science, of which the evaluation of predictive models is...