Abstract
In this paper we analyze the properties of the well-known segmentation fusion algorithm STAPLE, using a novel inference technique that analytically marginalizes out all model parameters. We demonstrate both theoretically and empirically that when the number of raters is large, or when consensus regions are included in the model, STAPLE devolves into thresholding the average of the input segmentations. We further show that when the number of raters is small, the STAPLE result may not be the optimal segmentation truth estimate, and its model parameter estimates might not reflect the individual raters’ actual segmentation performance. Our experiments indicate that these intrinsic weaknesses are frequently exacerbated by the presence of undesirable global optima and convergence issues. Together these results cast doubt on the soundness and usefulness of typical STAPLE outcomes.
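To make concrete the baseline that STAPLE is said to devolve into, the following minimal sketch fuses binary segmentations by thresholding their voxel-wise average. It is an illustration only, not the authors' inference technique. The function name `threshold_average`, the toy rater data, and the threshold of 0.5 are all assumptions introduced here; the abstract does not specify a threshold value.

```python
def threshold_average(segmentations, threshold=0.5):
    """Fuse binary segmentations by thresholding their voxel-wise mean.

    `segmentations` is a list with one entry per rater; each entry is a flat
    list of 0/1 labels, one per voxel. The threshold of 0.5 is an assumed
    value for illustration only.
    """
    n_raters = len(segmentations)
    # Fraction of raters that labeled each voxel as foreground.
    mean_map = [sum(votes) / n_raters for votes in zip(*segmentations)]
    # A voxel is foreground when that fraction exceeds the threshold.
    fused = [1 if m > threshold else 0 for m in mean_map]
    return fused, mean_map


# Hypothetical toy data: five raters labeling six voxels.
raters = [
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 0],
]

fused, mean_map = threshold_average(raters)
print(mean_map)
print(fused)
```

With a threshold of 0.5 this amounts to a majority vote. Every rater is weighted equally, so no per-rater performance parameters influence the fused result.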
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Van Leemput, K., Sabuncu, M.R. (2014). A Cautionary Analysis of STAPLE Using Direct Inference of Segmentation Truth. In: Golland, P., Hata, N., Barillot, C., Hornegger, J., Howe, R. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014. MICCAI 2014. Lecture Notes in Computer Science, vol 8673. Springer, Cham. https://doi.org/10.1007/978-3-319-10404-1_50
DOI: https://doi.org/10.1007/978-3-319-10404-1_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10403-4
Online ISBN: 978-3-319-10404-1
eBook Packages: Computer Science, Computer Science (R0)