Detecting Latin-Based Medical Terminology in Croatian Texts

  • Conference paper
Formalizing Natural Languages with NooJ 2018 and Its Natural Language Processing Applications (NooJ 2018)

Abstract

Regardless of the main language of a medical text, Latin-derived words and formative elements are always evident in its terminology. In general, this use of Latin shows language-specific morpho-semantic behavior in forming both technical-scientific and common-usage words. In Croatian medical texts, however, it is not consistent, because different word-formation mechanisms may be applied to the same term. To map all the different occurrences of the same concept to a single one, we propose a model built within NooJ and based on dictionaries and morphological grammars. Starting from the manual detection of nouns and their variations, we identify a number of word-formation mechanisms and develop grammars that recognize Latinisms and Croatinized Latin medical terminology.
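The idea of mapping spelling variants of one concept to a single key can be illustrated with a minimal Python sketch. This is not the NooJ model described in the abstract: the rewrite rules, the function names `croatize` and `group_variants`, and the sample terms are assumptions chosen for illustration, and the sketch ignores Croatian case inflection entirely.

```python
import re

# Hypothetical orthographic rules (Latin spelling -> Croatized spelling).
# They are illustrative assumptions, not the grammars from the paper.
RULES = [
    (r"ph", "f"),           # ph -> f
    (r"th", "t"),           # th -> t
    (r"ch", "h"),           # bronchitis -> bronhitis
    (r"y", "i"),            # y -> i
    (r"([a-z])\1", r"\1"),  # collapse doubled letters: appendicitis -> apendicitis
    (r"ia$", "ija"),        # pneumonia -> pneumonija
]


def croatize(term: str) -> str:
    """Return a normalized key shared by a Latin form and its Croatized variant."""
    key = term.lower()
    for pattern, replacement in RULES:
        key = re.sub(pattern, replacement, key)
    return key


def group_variants(tokens):
    """Group surface forms under their common normalized key."""
    groups = {}
    for token in tokens:
        groups.setdefault(croatize(token), set()).add(token)
    return groups


if __name__ == "__main__":
    sample = ["bronchitis", "Bronhitis", "pneumonia", "pneumonija"]
    for key, forms in sorted(group_variants(sample).items()):
        print(key, sorted(forms))
```

A dictionary-backed approach such as the one the abstract describes would replace these regular expressions with lexicon entries and morphological grammars, which can also account for inflected forms.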

Acknowledgement

This research has been partly supported by the European Regional Development Fund under the grant KK.01.1.1.01.0009 (DATACROSS).

Author information

Corresponding author

Correspondence to Kristina Kocijan.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Kocijan, K., di Buono, M.P., Mijić, L. (2019). Detecting Latin-Based Medical Terminology in Croatian Texts. In: Mirto, I., Monteleone, M., Silberztein, M. (eds) Formalizing Natural Languages with NooJ 2018 and Its Natural Language Processing Applications. NooJ 2018. Communications in Computer and Information Science, vol 987. Springer, Cham. https://doi.org/10.1007/978-3-030-10868-7_4

  • DOI: https://doi.org/10.1007/978-3-030-10868-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-10867-0

  • Online ISBN: 978-3-030-10868-7

  • eBook Packages: Computer Science, Computer Science (R0)
