Abstract
Acquisition of patterns for information extraction systems is a common task in Natural Language Processing, mostly based on manual analysis of text corpora. We have developed a system called Prométhée, which incrementally extracts lexico-syntactic patterns for a specific conceptual relation from a technical corpus. However, these patterns are often too general and need to be manually validated.
In this paper, we show how Prométhée has been interfaced with the machine learning system Eagle in order to automatically refine the patterns it produces. The empirical results obtained with this technique show that the refined patterns reduce the need for human validation.
We would like to thank C. Jacquemin and M. Quafafou for helpful discussions on this work.
Keywords
- Noun Phrase
- Natural Language Processing
- Inductive Logic Programming
- Syntactic Pattern
- Inductive Logic Programming System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Morin, E., Martienne, E. (2000). Using a Symbolic Machine Learning Tool to Refine Lexico-syntactic Patterns. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. ECML 2000. Lecture Notes in Computer Science, vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_31
DOI: https://doi.org/10.1007/3-540-45164-1_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8
eBook Packages: Springer Book Archive