Abstract
Studies of several term extractors on a biomedical corpus revealed decreasing performance when the tools are applied to highly technical texts. Because existing tools are difficult or impossible to customize, we developed a tunable term extractor. It combines linguistic rules with the reuse of existing terminologies, i.e. exogenous disambiguation. The experiments reported here show that combining the two strategies allows the extraction of a greater number of term candidates with a higher level of reliability. We further describe the extraction process, which involves both endogenous and exogenous disambiguation, as implemented in the term extractor YaTeA.
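As a rough illustration of the idea summarized above, the following Python sketch pairs a simple linguistic rule with a lookup in an existing terminology. It is not YaTeA's actual algorithm: the tag set (`ADJ`, `NOUN`), the noun-phrase pattern, the function names `candidate_phrases` and `known_subterms`, and the toy terminology are all assumptions made for the example.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tag set is assumed


def candidate_phrases(tagged: List[Token]) -> List[Tuple[str, ...]]:
    """Assumed linguistic rule: keep maximal runs of ADJ/NOUN tokens
    that end in a NOUN, lowercased."""
    phrases: List[Tuple[str, ...]] = []
    run: List[Token] = []
    # A sentinel token flushes the last run at the end of the input.
    for word, tag in tagged + [("", "END")]:
        if tag in {"ADJ", "NOUN"}:
            run.append((word.lower(), tag))
            continue
        # Trim trailing adjectives so that the phrase ends in a noun.
        while run and run[-1][1] != "NOUN":
            run.pop()
        if run:
            phrases.append(tuple(w for w, _ in run))
        run = []
    return phrases


def known_subterms(phrase: Tuple[str, ...], terminology: Set[str]) -> List[str]:
    """Assumed exogenous step: list the multi-word sub-sequences of a
    candidate that already appear in an existing terminology."""
    found: List[str] = []
    n = len(phrase)
    for i in range(n):
        for j in range(i + 2, n + 1):  # sub-sequences of at least two words
            sub = " ".join(phrase[i:j])
            if sub in terminology:
                found.append(sub)
    return found


# Toy input: a hand-tagged sentence and a one-entry terminology (both invented).
tagged = [("The", "DET"), ("human", "ADJ"), ("gene", "NOUN"),
          ("expression", "NOUN"), ("is", "VERB"), ("high", "ADJ")]
terminology = {"gene expression"}

for phrase in candidate_phrases(tagged):
    print(" ".join(phrase), "->", known_subterms(phrase, terminology))
```

In this sketch, a sub-term found in the terminology could be used to decide how the longer candidate is bracketed; how such evidence is weighed against corpus-internal (endogenous) evidence is left open here.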
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aubin, S., Hamon, T. (2006). Improving Term Extraction with Terminological Resources. In: Salakoski, T., Ginter, F., Pyysalo, S., Pahikkala, T. (eds) Advances in Natural Language Processing. FinTAL 2006. Lecture Notes in Computer Science, vol 4139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11816508_39
DOI: https://doi.org/10.1007/11816508_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37334-6
Online ISBN: 978-3-540-37336-0
eBook Packages: Computer Science (R0)