There are many hundreds of fault prediction models published in the literature. The predictive performance of these models is often reported using a variety of different measures. Most performance measures are not directly comparable. This lack of comparability means that it is often difficult to evaluate the performance of one model against another. Our aim is to present an approach that allows other researchers and practitioners to transform many performance measures of categorical studies back into a confusion matrix. Once performance is expressed in a confusion matrix alternative preferred performance measures can then be derived. Our approach has enabled us to compare the performance of 600 models published in 42 studies. We demonstrat...
Various approaches are presented in the literature to identify fault-prone components. The approache...
(a) Recovery rates across families. Rows represent true models/families (used to simulate data), and...
<p>(A), Confusion matrix of the model performance on the dataset of 15 scenes. The average accuracy ...
There are many hundreds of fault prediction models published in the literature. The predictive perfo...
Awarded Best Paper in Conference. "© ACM, 2012. This is the author's version of the work. It is post...
There are many hundreds of fault prediction models published in the literature. The predictive perfo...
In the field of cognitive science, the primary means of judging a model’s viability is made on the b...
Background: The accurate prediction of where faults are likely to occur in code can help direct test...
This is a post-print of the article accepted for publication. The definitive version can be accessed...
<p>Confusion matrix and performance estimates of the multiclass prediction model.</p
Software defect prediction performance varies over a large range. Menzies suggested there is a ceili...
A common set of statistical metrics has been used to summarize the performance of models or measurem...
peer-reviewedBackground: The accurate prediction of where faults are likely to occur in code can hel...
This note briefly describes some important performance measures that can be used in failure predicti...
Software defect prediction research has adopted various evaluation measures to assess the performanc...
Various approaches are presented in the literature to identify fault-prone components. The approache...
(a) Recovery rates across families. Rows represent true models/families (used to simulate data), and...
<p>(A), Confusion matrix of the model performance on the dataset of 15 scenes. The average accuracy ...
There are many hundreds of fault prediction models published in the literature. The predictive perfo...
Awarded Best Paper in Conference. "© ACM, 2012. This is the author's version of the work. It is post...
There are many hundreds of fault prediction models published in the literature. The predictive perfo...
In the field of cognitive science, the primary means of judging a model’s viability is made on the b...
Background: The accurate prediction of where faults are likely to occur in code can help direct test...
This is a post-print of the article accepted for publication. The definitive version can be accessed...
<p>Confusion matrix and performance estimates of the multiclass prediction model.</p
Software defect prediction performance varies over a large range. Menzies suggested there is a ceili...
A common set of statistical metrics has been used to summarize the performance of models or measurem...
peer-reviewedBackground: The accurate prediction of where faults are likely to occur in code can hel...
This note briefly describes some important performance measures that can be used in failure predicti...
Software defect prediction research has adopted various evaluation measures to assess the performanc...
Various approaches are presented in the literature to identify fault-prone components. The approache...
(a) Recovery rates across families. Rows represent true models/families (used to simulate data), and...
<p>(A), Confusion matrix of the model performance on the dataset of 15 scenes. The average accuracy ...