Experiences from the ImageCLEF Medical Retrieval and Annotation Tasks

Chapter in: Information Retrieval Evaluation in a Changing World

Abstract

The medical tasks in ImageCLEF were run every year from 2004 to 2018, and many different tasks and data sets were used over these years. The resources created are used by many researchers well beyond the actual evaluation campaigns and make it possible to compare the performance of many techniques on the same grounds and in a reproducible way. Many of the larger data sets come from the medical literature, as such images are easier to obtain and to share than clinical data. Clinical data were used in a few smaller ImageCLEF challenges that are specifically marked with the disease type and anatomic region. This chapter describes the main results of the various tasks over the years, including data, participants, and types of tasks evaluated, as well as the lessons learned in organizing such tasks for the scientific community.

Acknowledgements

We would like to thank the various funding organizations that have helped make ImageCLEF a reality (EU FP6 & FP7, SNF, RCSO, Google and others) and all the volunteer researchers who helped organize the tasks. Another big thank you goes to the data providers who ensured that medical data could be shared with the participants. Final thanks go to all participants who work on the tasks and provide us with techniques to compare and with lively discussions at the ImageCLEF workshops.

Corresponding author

Correspondence to Henning Müller.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Müller, H., Kalpathy-Cramer, J., García Seco de Herrera, A. (2019). Experiences from the ImageCLEF Medical Retrieval and Annotation Tasks. In: Ferro, N., Peters, C. (eds) Information Retrieval Evaluation in a Changing World. The Information Retrieval Series, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-030-22948-1_10

  • DOI: https://doi.org/10.1007/978-3-030-22948-1_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-22947-4

  • Online ISBN: 978-3-030-22948-1

  • eBook Packages: Computer Science, Computer Science (R0)
