Abstract
Definition Extraction (DE) is the task of automatically identifying definitional knowledge in naturally occurring text. It has applications in ontology generation, glossary creation, and question answering. Although the traditional approach to DE has relied on hand-crafted pattern-matching rules, recent methods incorporate learning algorithms to classify sentences as definitional or non-definitional. This paper presents a supervised approach to DE that uses only syntactic features derived from dependency relations. We model the problem as a classification task in which each sentence is labeled as definitional or non-definitional. We compare our results with two well-known approaches: a supervised method based on Word-Class Lattices and an unsupervised approach based on mining recurrent patterns. Our competitive results suggest that syntactic information alone can contribute substantially to the development and improvement of DE systems.
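To make the setup in the abstract concrete, the following is a minimal sketch of sentence classification from dependency-relation features alone. It is not the authors' feature set or learner: the toy (head, relation, dependent) triples, the relation labels, the feature templates, and the simple perceptron are all assumptions chosen for illustration.

```python
# Hypothetical toy data: each sentence is a list of (head, relation, dependent)
# triples, as a dependency parser might produce. Label 1 = definitional.
TRAIN = [
    ([("is", "nsubj", "TERM"), ("is", "attr", "NOUN"), ("NOUN", "det", "a")], 1),
    ([("refers", "nsubj", "TERM"), ("refers", "prep", "to"), ("to", "pobj", "NOUN")], 1),
    ([("bought", "nsubj", "PRON"), ("bought", "dobj", "NOUN")], 0),
    ([("rained", "advmod", "yesterday"), ("rained", "nsubj", "it")], 0),
]


def features(triples):
    """Map dependency triples to sparse binary features (syntactic only).

    The three templates below are illustrative assumptions.
    """
    feats = {}
    for head, rel, dep in triples:
        feats["rel=" + rel] = 1.0
        feats["head_rel=" + head + "|" + rel] = 1.0
        feats["rel_dep=" + rel + "|" + dep] = 1.0
    return feats


def score(weights, feats):
    """Linear score: dot product of the weight vector and the feature vector."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def train_perceptron(data, epochs=10):
    """Train a binary perceptron; weights change only on misclassified examples."""
    weights = {}
    for _ in range(epochs):
        for triples, label in data:
            feats = features(triples)
            predicted = 1 if score(weights, feats) > 0 else 0
            if predicted != label:
                step = 1.0 if label == 1 else -1.0
                for name, value in feats.items():
                    weights[name] = weights.get(name, 0.0) + step * value
    return weights


def classify(weights, triples):
    """Return 1 if the sentence is predicted to be definitional, else 0."""
    return 1 if score(weights, features(triples)) > 0 else 0


WEIGHTS = train_perceptron(TRAIN)
```

In a realistic pipeline the triples would come from a dependency parser run over each sentence, and the perceptron could be swapped for any standard supervised classifier.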
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Espinosa-Anke, L., Saggion, H. (2014). Applying Dependency Relations to Definition Extraction. In: Métais, E., Roche, M., Teisseire, M. (eds) Natural Language Processing and Information Systems. NLDB 2014. Lecture Notes in Computer Science, vol 8455. Springer, Cham. https://doi.org/10.1007/978-3-319-07983-7_10
DOI: https://doi.org/10.1007/978-3-319-07983-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07982-0
Online ISBN: 978-3-319-07983-7
eBook Packages: Computer Science (R0)