
Applying Dependency Relations to Definition Extraction

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8455)

Abstract

Definition Extraction (DE) is the task of automatically identifying definitional knowledge in naturally occurring text. It has applications in ontology generation, glossary creation, and question answering. Although the traditional approach to DE has relied on hand-crafted pattern-matching rules, recent methods incorporate learning algorithms to classify sentences as definitional or non-definitional. This paper presents a supervised approach to DE that uses only syntactic features derived from dependency relations. We model the problem as a classification task in which each sentence must be labelled as definitional or non-definitional. We compare our results with two well-known approaches: a supervised method based on Word-Class Lattices, and an unsupervised approach based on mining recurrent patterns. Our competitive results suggest that syntactic information alone can contribute substantially to the development and improvement of DE systems.
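The classification setup described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the hand-written dependency triples, the Universal Dependencies-style relation labels, the bag-of-relations feature set, and the perceptron learner are not taken from the paper, which does not state its features or classifier in this abstract.

```python
from collections import Counter

# Hypothetical toy data: each sentence is a list of (head, relation, dependent)
# triples such as a dependency parser might emit, paired with a label
# (1 = definitional, 0 = non-definitional). All triples are hand-written.
TRAIN = [
    # "Python is a programming language"
    ([("language", "nsubj", "Python"), ("language", "cop", "is"),
      ("language", "det", "a"), ("language", "amod", "programming")], 1),
    # "She bought a car yesterday"
    ([("bought", "nsubj", "She"), ("bought", "obj", "car"),
      ("car", "det", "a"), ("bought", "advmod", "yesterday")], 0),
    # "A lattice is an ordered set"
    ([("set", "nsubj", "lattice"), ("set", "cop", "is"), ("lattice", "det", "A"),
      ("set", "det", "an"), ("set", "amod", "ordered")], 1),
    # "The parser failed on long sentences"
    ([("failed", "nsubj", "parser"), ("parser", "det", "The"),
      ("failed", "obl", "sentences"), ("sentences", "case", "on"),
      ("sentences", "amod", "long")], 0),
]


def features(triples):
    """Count how often each dependency relation label occurs in a sentence."""
    return Counter(rel for _head, rel, _dep in triples)


def score(weights, feats):
    """Dot product between a weight vector and a sparse feature vector."""
    return sum(weights[name] * value for name, value in feats.items())


def classify(weights, triples):
    """Return 1 (definitional) if the linear score is positive, else 0."""
    return 1 if score(weights, features(triples)) > 0 else 0


def train(data, epochs=10):
    """Plain perceptron: update the weights only on misclassified sentences."""
    weights = Counter()
    for _ in range(epochs):
        for triples, label in data:
            if classify(weights, triples) != label:
                sign = 1 if label == 1 else -1
                for name, value in features(triples).items():
                    weights[name] += sign * value
    return weights


WEIGHTS = train(TRAIN)

# "A glossary is a list of terms" (unseen, hand-parsed)
UNSEEN_DEFINITION = [
    ("list", "nsubj", "glossary"), ("list", "cop", "is"), ("glossary", "det", "A"),
    ("list", "det", "a"), ("list", "nmod", "terms"), ("terms", "case", "of"),
]
print(classify(WEIGHTS, UNSEEN_DEFINITION))
```

On this toy data the copula relation ends up with the largest positive weight, which reflects the familiar "X is a Y" shape of many definitions. A real system would obtain the triples from a parser and would need far more training data.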




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Espinosa-Anke, L., Saggion, H. (2014). Applying Dependency Relations to Definition Extraction. In: Métais, E., Roche, M., Teisseire, M. (eds) Natural Language Processing and Information Systems. NLDB 2014. Lecture Notes in Computer Science, vol 8455. Springer, Cham. https://doi.org/10.1007/978-3-319-07983-7_10


  • DOI: https://doi.org/10.1007/978-3-319-07983-7_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07982-0

  • Online ISBN: 978-3-319-07983-7

  • eBook Packages: Computer Science, Computer Science (R0)
