Abstract
A growing population of users wants to extract a growing variety of information from on-line texts. Unfortunately, current information extraction systems typically require experts to hand-build dictionaries of extraction patterns for each new type of information to be extracted. This paper presents a system that can learn dictionaries of extraction patterns directly from user-provided examples of texts and events to be extracted from them. The system, called LIEP, learns patterns that recognize relationships between key constituents based on local syntax. Sets of patterns learned by LIEP for a sample extraction task perform nearly at the level of a hand-built dictionary of patterns.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huffman, S.B. (1996). Learning information extraction patterns from examples. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_51
DOI: https://doi.org/10.1007/3-540-60925-3_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive