Overview of the ShARe/CLEF eHealth Evaluation Lab 2013

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8138)

Abstract

Discharge summaries and other free-text reports in healthcare transfer information between working shifts and geographic locations. Patients are likely to have difficulties in understanding their content because of medical jargon, non-standard abbreviations, and ward-specific idioms. This paper reports on an evaluation lab that aims to support the continuum of care by developing methods and resources that make clinical reports in English easier for patients to understand and help them find information related to their condition. The ShARe/CLEF eHealth 2013 lab offered student mentoring and shared tasks: identification and normalisation of disorders (Tasks 1a and 1b) and normalisation of abbreviations and acronyms (Task 2) in clinical reports with respect to healthcare terminology standards, as well as information retrieval (Task 3) to address questions patients may have when reading clinical reports. The focus on patients’ information needs, as opposed to the specialised information needs of physicians and other healthcare workers, was the main feature distinguishing the lab from previous shared tasks. De-identified clinical reports for the three tasks came from US intensive care and originated from the MIMIC II database. Other text documents for Task 3 came from the Internet and originated from the Khresmoi project. Task 1 annotations originated from the ShARe annotations. For Tasks 2 and 3, new annotations, queries, and relevance assessments were created. In total, 64, 56, and 55 people registered their interest in Tasks 1, 2, and 3, respectively. 34 unique teams (3 members per team on average) participated, with 22, 17, 5, and 9 teams in Tasks 1a, 1b, 2, and 3, respectively. The teams were from Australia, China, France, India, Ireland, Republic of Korea, Spain, UK, and USA. Some teams developed and used additional annotations, but this strategy improved system performance only in Task 2.
The best systems achieved an F1 score of 0.75 in Task 1a, accuracies of 0.59 and 0.72 in Tasks 1b and 2, and a precision at 10 of 0.52 in Task 3. The results demonstrate the substantial community interest in, and the capabilities of, these systems in making clinical reports easier for patients to understand. The organisers have made data and tools available for future research and development.
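The headline numbers above combine two standard evaluation measures: the F1 score, the harmonic mean of precision and recall over extracted disorder mentions, and precision at 10 (P@10), the fraction of relevant documents among the top ten retrieved. A minimal sketch of how such scores are computed (function and variable names are illustrative, not taken from the lab's evaluation scripts):

```python
def f1_score(tp, fp, fn):
    """F1 from raw counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def precision_at_k(retrieved, relevant, k=10):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k

# Example: 75 true positives, 25 false positives, 25 false negatives
# gives precision = recall = 0.75, hence F1 = 0.75.
print(f1_score(75, 25, 25))
```

P@10 suits this patient-centred retrieval setting because it scores only the first results page a patient would actually read, rather than the full ranked list.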




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Suominen, H. et al. (2013). Overview of the ShARe/CLEF eHealth Evaluation Lab 2013. In: Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visualization. CLEF 2013. Lecture Notes in Computer Science, vol 8138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40802-1_24

  • DOI: https://doi.org/10.1007/978-3-642-40802-1_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40801-4

  • Online ISBN: 978-3-642-40802-1

  • eBook Packages: Computer Science (R0)
