Lung Cancer Concept Annotation from Spanish Clinical Narratives

  • Marjan Najafabadipour
  • Juan Manuel Tuñas
  • Alejandro Rodríguez-González
  • Ernestina Menasalvas
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11371)


The recent rapid increase in the generation of clinical data, together with rapid advances in computational science, makes it possible to extract new insights from massive datasets in the healthcare industry. Oncological Electronic Health Records (EHRs) are creating rich databases that document patients' histories, and they potentially contain many patterns that can help in better management of the disease. However, these patterns are locked within the free-text (unstructured) portions of EHRs, which limits the ability of health professionals to extract useful information from them and, ultimately, to perform the Query and Answering (Q&A) process accurately. The Information Extraction (IE) process requires Natural Language Processing (NLP) techniques to assign semantics to these patterns. Therefore, in this paper, we analyze the design of annotators for specific lung cancer concepts that can be integrated into the Apache Unstructured Information Management Architecture (UIMA) framework. In addition, we explain the details of the generation and storage of annotation outcomes.


Electronic health record, Natural language processing, Named entity recognition, Lung cancer



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
  2. ETS de Ingenieros Informáticos, Universidad Politécnica de Madrid, Madrid, Spain
