Abstract
Considerable effort is devoted to developing algorithms that produce objective, reproducible segmentations of medical image data. However, the introduction of these techniques into clinical practice has been hampered by the lack of thorough performance evaluation. If the gold standard is defined by human performance, a full validation study is costly, since it should comprise both a large number of datasets and a large number of trained medical experts to inspect them. In this case, the result of the algorithm should be compared with the weighted average of the observers. An algorithm performs well enough if its deviation from this average is smaller than the variability among the medical experts. Three important steps in this procedure are selecting the parameter to be compared, determining a gold standard (the weighted average), and choosing an error metric that defines how much an individual result differs from the gold standard. These steps are illustrated by a clinical evaluation of different techniques for segmenting intravascular ultrasound images.
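The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's actual procedure: the function names `gold_standard` and `evaluate` are hypothetical, the compared parameter is taken to be a single scalar per image (for instance a lumen area), the error metric is assumed to be the absolute difference, the observer variability is approximated by the mean absolute deviation from the gold standard, and the numbers are invented for the example.

```python
from statistics import mean


def gold_standard(observer_values, weights=None):
    """Weighted average of the observers' measurements for one image.

    With weights=None all observers count equally (an assumption).
    """
    if weights is None:
        weights = [1.0] * len(observer_values)
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, observer_values)) / total


def evaluate(observers, algorithm, weights=None):
    """Compare an algorithm with a panel of observers.

    observers: one list per image, holding one value per observer.
    algorithm: one value per image.
    Returns (mean algorithm error, mean observer error), both measured
    as absolute differences from the per-image gold standard.
    """
    alg_errors, obs_errors = [], []
    for obs_vals, alg_val in zip(observers, algorithm):
        gs = gold_standard(obs_vals, weights)
        alg_errors.append(abs(alg_val - gs))
        obs_errors.append(mean([abs(v - gs) for v in obs_vals]))
    return mean(alg_errors), mean(obs_errors)


# Invented example data: two images, three observers each.
observers = [[10.0, 11.0, 12.0], [20.0, 22.0, 21.0]]
algorithm = [11.5, 20.5]

alg_err, obs_err = evaluate(observers, algorithm)
# Acceptance rule from the abstract, in this simplified form:
# the algorithm should be closer to the average than the observers are.
acceptable = alg_err <= obs_err
print(alg_err, obs_err, acceptable)
```

One simplification to keep in mind: each observer's own value contributes to the gold standard it is compared against, which slightly understates the observer error for small panels. A leave-one-out average would be one way to avoid this.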
© 2000 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Niessen, W.J., Bouma, C.J., Vincken, K.L., Viergever, M.A. (2000). Error Metrics for Quantitative Evaluation of Medical Image Segmentation. In: Klette, R., Stiehl, H.S., Viergever, M.A., Vincken, K.L. (eds) Performance Characterization in Computer Vision. Computational Imaging and Vision, vol 17. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-9538-4_22
DOI: https://doi.org/10.1007/978-94-015-9538-4_22
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5487-6
Online ISBN: 978-94-015-9538-4
eBook Packages: Springer Book Archive