Summary
Every year, new algorithms for medical image analysis are published, covering segmentation, registration, classification and retrieval. However, before these advances can be translated into clinical practice, the relative effectiveness of the algorithms needs to be evaluated.
In this chapter, we begin with the motivation for systematic evaluations in science and, more specifically, in medical image analysis. We review the components of successful evaluation campaigns: realistic data sets and tasks, the gold standards used to compare systems, the choice of performance measures, and workshops where participants share their experiences with the tasks and explain their approaches. We also describe some of the popular efforts that have been conducted to evaluate retrieval, classification, segmentation and registration techniques. We then describe the challenges in organizing such campaigns, including the acquisition of image databases of sufficient size and quality, the establishment of sound metrics and ground truth, the management of manpower and resources, the motivation of participants, and the maintenance of a friendly level of competitiveness among participants. We conclude with lessons learned over years of organizing campaigns, including successes and roadblocks.
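As a small illustration of what a performance measure against a gold standard can look like for segmentation, the sketch below computes two widely used overlap scores, the Dice coefficient and the Jaccard index, between a predicted mask and a reference mask. It is a minimal example under stated assumptions and not the evaluation protocol of any particular campaign: the function name `overlap_scores`, the flat list representation of the masks, and the convention of returning 1.0 when both masks are empty are all choices made here for illustration.

```python
def overlap_scores(predicted, reference):
    """Return (dice, jaccard) for two binary masks given as flat sequences of 0/1.

    Hypothetical helper for illustration only; real campaigns define their
    own measures and conventions.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of elements")

    pred = [bool(v) for v in predicted]
    ref = [bool(v) for v in reference]

    # Count elements labelled foreground in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    size_sum = sum(pred) + sum(ref)
    union = size_sum - intersection

    # Assumed convention: two empty masks agree perfectly.
    if size_sum == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / size_sum
    jaccard = intersection / union
    return dice, jaccard


# Toy masks: 3 foreground elements each, 2 of them shared.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]
dice, jaccard = overlap_scores(predicted, reference)
print(f"Dice = {dice:.3f}, Jaccard = {jaccard:.3f}")
```

Both scores range from 0 (no overlap) to 1 (identical masks). For the same pair of masks, Dice is never lower than Jaccard. Switching between the two therefore changes the reported values but not the order in which systems are ranked.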
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Kalpathy-Cramer, J., Müller, H. (2010). Systematic Evaluations and Ground Truth. In: Deserno, T. (eds) Biomedical Image Processing. Biological and Medical Physics, Biomedical Engineering. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15816-2_20
DOI: https://doi.org/10.1007/978-3-642-15816-2_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15815-5
Online ISBN: 978-3-642-15816-2
eBook Packages: Physics and Astronomy (R0)