The Empirical Variance Estimator for Computer Aided Diagnosis: Lessons for Algorithm Validation
Computer-aided diagnosis is an established field in medical image analysis, and a great deal of effort goes into developing and refining pipelines to achieve better performance. This improvement depends on reliable comparison between pipelines, which in turn depends on variance estimation. For supervised methods, variance estimation can be confounded by statistical issues at the comparatively small sample sizes typical of the field. Despite the importance of reliable comparison to pipeline development, this issue has received relatively little attention. As a solution, we advocate an empirical variance estimator based on validation within disjoint subsets of the available data. Using Alzheimer’s disease classification in the ADNI dataset as an exemplar, we investigate the behaviour of different variance estimators in a series of resampling experiments. We show that the proposed estimator is unbiased, and that it yields larger variance estimates than naive approaches, which are biased downward. Because the estimator avoids independence assumptions, it can accommodate arbitrary validation strategies and performance metrics. Because it is unbiased, it can provide statistically convincing comparisons and confidence intervals for algorithm performance. Finally, we show how the estimator can be used to compare different validation strategies, and we make some recommendations about which should be used.
Keywords: Cross Validation, Variance Estimator, Unbiased Estimator, Validation Strategy, Medical Image Analysis
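The idea of estimating variance from validation within disjoint subsets can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic one-dimensional data, the nearest-class-mean classifier, the choice of 5-fold cross-validation, and all function names (`nearest_mean_accuracy`, `cross_validated_accuracy`, `empirical_variance`) are assumptions made for illustration. The data are split into disjoint subsets, the full validation procedure is run independently inside each one, and the sample variance of the resulting performance estimates is reported.

```python
import random
import statistics


def nearest_mean_accuracy(train, test):
    """Fit a 1-D nearest-class-mean classifier on train, return test accuracy."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        # Illustrative fallback: use 0.0 if a class is missing from the fold.
        means[label] = statistics.fmean(values) if values else 0.0
    correct = sum(
        1 for x, y in test
        if min(means, key=lambda c: abs(x - means[c])) == y
    )
    return correct / len(test)


def cross_validated_accuracy(data, n_folds=5):
    """Mean accuracy over n_folds cross-validation folds of one subset."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        scores.append(nearest_mean_accuracy(train, test))
    return statistics.fmean(scores)


def empirical_variance(data, n_subsets, validate, rng):
    """Run `validate` inside disjoint subsets; return variance and estimates."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    # Disjoint subsets: no sample is shared between any two validation runs.
    subsets = [shuffled[i::n_subsets] for i in range(n_subsets)]
    estimates = [validate(subset) for subset in subsets]
    return statistics.variance(estimates), estimates


# Synthetic two-class data (an assumption; stands in for real features).
rng = random.Random(0)
data = [(rng.gauss(label, 1.0), label) for label in (0, 1) for _ in range(200)]

variance, estimates = empirical_variance(
    data, n_subsets=8, validate=cross_validated_accuracy, rng=rng
)
print("per-subset accuracy estimates:", [round(e, 3) for e in estimates])
print("empirical variance:", variance)
```

Because `validate` is passed in as a function, the same sketch works for any validation strategy or performance metric. Note that the variance obtained this way describes the validation procedure at the subset size (here 50 samples), not at the size of the full dataset.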