Combining Evidence for Automatic Extraction of Terms

  • Boris Dobrov
  • Natalia Loukachevitch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6744)


This paper describes a method for extracting two-word domain terms by combining several of their features. The features are computed from three sources: occurrence statistics in a domain-specific text collection, statistics from global search engines, and a domain-specific thesaurus. The approach is evaluated against the terminology of manually created thesauri. We show that using multiple features considerably improves the automatic extraction of domain-specific terms, and we compare the quality of the proposed method in two different domains.
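The idea of combining evidence can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the learning algorithm or the individual features, so the logistic regression, the three feature names, the candidate phrases, and all numeric values below are assumptions chosen purely for illustration. Each two-word candidate is represented by one hypothetical score per evidence source, and the labels stand in for membership in a manually created thesaurus.

```python
import math

# Hypothetical feature vectors for two-word candidates, one score per source:
# (collection_score, search_engine_score, thesaurus_score), each scaled to [0, 1].
# Labels: 1 = candidate is a term in the reference thesaurus, 0 = it is not.
# All values are invented for illustration only.
TRAIN = [
    ((0.9, 0.8, 1.0), 1),
    ((0.7, 0.9, 0.0), 1),
    ((0.8, 0.6, 1.0), 1),
    ((0.2, 0.1, 0.0), 0),
    ((0.4, 0.2, 0.0), 0),
    ((0.1, 0.3, 0.0), 0),
]


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(data, epochs=2000, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient ascent."""
    n_features = len(data[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = y - p  # gradient of the log-likelihood w.r.t. the linear score
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


def score(x, w, b):
    """Combined termhood estimate for one candidate's feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def rank(candidates, w, b):
    """Sort (phrase, features) pairs by combined score, best first."""
    return sorted(candidates, key=lambda item: score(item[1], w, b), reverse=True)


W, B = train_logistic(TRAIN)

if __name__ == "__main__":
    # Hypothetical unseen candidates; phrases and features are made up.
    unseen = [
        ("last year", (0.1, 0.1, 0.0)),
        ("neural network", (0.9, 0.9, 1.0)),
    ]
    for phrase, features in rank(unseen, W, B):
        print(f"{phrase}: {score(features, W, B):.3f}")
```

In this sketch a single learned weight vector replaces any hand-tuned combination of the individual scores, which is one simple way that multiple features can be merged into one ranking of candidates.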


term acquisition, thesaurus, Internet search, machine learning



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Boris Dobrov (1)
  • Natalia Loukachevitch (1)
  1. Research Computing Center of Lomonosov Moscow State University, Moscow, Russia
