Detecting Latin-Based Medical Terminology in Croatian Texts

Kocijan, Kristina; di Buono, Maria Pia; Mijić, Linda

doi:10.1007/978-3-030-10868-7_4

Part of the book series: Communications in Computer and Information Science (CCIS, volume 987)

Included in the following conference series:

International Conference on Automatic Processing of Natural-Language Electronic Texts with NooJ

Abstract

Whatever the main language of a medical text, there is always evidence of Latin-derived words and formative elements in its terminology. In general, this usage shows language-specific morpho-semantic behavior in the formation of both technical-scientific and common-usage words. In Croatian medical texts, however, the use of Latin does not appear consistent, because different word-formation mechanisms may be applied to the same term. To map all the different occurrences of the same concept to a single one, we propose a model designed within NooJ and based on dictionaries and morphological grammars. Starting from the manual detection of nouns and their variations, we identify a number of word-formation mechanisms and develop grammars that recognize Latinisms and Croatinized Latin medical terminology.
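The idea of mapping spelling variants of one concept to a single entry can be illustrated with a minimal Python sketch. It is not the authors' NooJ model: the lexicon `LATIN_LEXICON`, the rewrite rules in `REVERSE_RULES`, and the function `normalize` are hypothetical names introduced here for illustration only, and the two rules cover just a few well-known Latin/Croatian spelling pairs (for example pneumonia/pneumonija). Croatian case inflection is deliberately left out.

```python
import re

# Hypothetical toy lexicon: Latin spelling -> concept label.
# A real system would use full dictionaries; these entries are illustrative.
LATIN_LEXICON = {
    "pneumonia": "PNEUMONIA",
    "diabetes": "DIABETES",
    "appendicitis": "APPENDICITIS",
}

# Illustrative rewrite rules (assumptions, not the authors' grammars).
# Each rule undoes one Croatian spelling adaptation of a Latin form.
REVERSE_RULES = [
    (re.compile(r"ija"), "ia"),        # pneumonija -> pneumonia, dijabetes -> diabetes
    (re.compile(r"^ap(?=e)"), "app"),  # apendicitis -> appendicitis (toy rule)
]


def normalize(token):
    """Return the concept label for a Latin or Croatinized token, or None."""
    form = token.lower()
    if form in LATIN_LEXICON:
        return LATIN_LEXICON[form]
    # Apply the rules one after another, checking the lexicon after each step.
    for pattern, replacement in REVERSE_RULES:
        form = pattern.sub(replacement, form)
        if form in LATIN_LEXICON:
            return LATIN_LEXICON[form]
    return None


if __name__ == "__main__":
    for word in ["Pneumonia", "pneumonija", "dijabetes", "apendicitis", "glava"]:
        print(word, "->", normalize(word))
```

Both the Latin spelling and the adapted spelling resolve to the same label, which is the behavior the abstract describes at a high level; a dictionary-plus-grammar system would express such rules declaratively instead of as regular expressions.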

Acknowledgement

This research has been partly supported by the European Regional Development Fund under the grant KK.01.1.1.01.0009 (DATACROSS).

Author information

Authors and Affiliations

Department of Information and Communication Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, Zagreb, Croatia
Kristina Kocijan
TakeLab ZEMRIS, Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
Maria Pia di Buono
Department of Classical Philology, University of Zadar, Zadar, Croatia
Linda Mijić

Corresponding author

Correspondence to Kristina Kocijan.

Editor information

Editors and Affiliations

University of Palermo, Palermo, Italy
Ignazio Mauro Mirto
University of Salerno, Fisciano, Italy
Mario Monteleone
Université de Franche-Comté, Besançon, France
Max Silberztein

Cite this paper

Kocijan, K., di Buono, M.P., Mijić, L. (2019). Detecting Latin-Based Medical Terminology in Croatian Texts. In: Mirto, I., Monteleone, M., Silberztein, M. (eds) Formalizing Natural Languages with NooJ 2018 and Its Natural Language Processing Applications. NooJ 2018. Communications in Computer and Information Science, vol 987. Springer, Cham. https://doi.org/10.1007/978-3-030-10868-7_4

DOI: https://doi.org/10.1007/978-3-030-10868-7_4
Published: 25 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10867-0
Online ISBN: 978-3-030-10868-7
eBook Packages: Computer Science, Computer Science (R0)
