Abstract
In this paper we present a method for the semantic annotation of texts that is based on deep linguistic analysis (DLA) and Inductive Logic Programming (ILP). The combination of DLA and ILP has the following benefits: manual selection of learning features is not needed; the learning procedure has the full available linguistic information at its disposal and is able to select the relevant parts itself; and the learned extraction rules can be easily visualized, understood, and adapted by humans. The main contributions of the paper are a description, an implementation, and an initial evaluation of the method.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dědek, J. (2010). Towards Semantic Annotation Supported by Dependency Linguistics and ILP. In: Patel-Schneider, P.F., et al. The Semantic Web – ISWC 2010. ISWC 2010. Lecture Notes in Computer Science, vol 6497. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17749-1_20
DOI: https://doi.org/10.1007/978-3-642-17749-1_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17748-4
Online ISBN: 978-3-642-17749-1
eBook Packages: Computer Science, Computer Science (R0)