Learning information extraction patterns from examples

Huffman, Scott B.

doi:10.1007/3-540-60925-3_51

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1040)

Included in the following conference series:

International Joint Conference on Artificial Intelligence

Abstract

A growing population of users want to extract a growing variety of information from on-line texts. Unfortunately, current information extraction systems typically require experts to hand-build dictionaries of extraction patterns for each new type of information to be extracted. This paper presents a system that can learn dictionaries of extraction patterns directly from user-provided examples of texts and events to be extracted from them. The system, called LIEP, learns patterns that recognize relationships between key constituents based on local syntax. Sets of patterns learned by LIEP for a sample extraction task perform nearly at the level of a hand-built dictionary of patterns.
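
The idea of learning an extraction pattern from one annotated example and reusing it on new text can be illustrated with a minimal sketch. This is not LIEP's actual algorithm, which the abstract does not detail. It assumes sentences are already split into tagged chunks, and the function names (`learn_pattern`, `match_pattern`), the chunk tags, the role names, and the sample sentences are all hypothetical.

```python
def learn_pattern(chunks, roles):
    """Build a pattern from one annotated example (illustrative only).

    chunks: list of (text, tag) pairs for one sentence (assumed pre-chunked).
    roles:  dict mapping a chunk index to the role that chunk fills.
    The pattern spans the first to the last role filler; fillers become
    typed slots and the chunks between them become literals.
    """
    lo, hi = min(roles), max(roles)
    pattern = []
    for i in range(lo, hi + 1):
        text, tag = chunks[i]
        if i in roles:
            pattern.append(("slot", roles[i], tag))
        else:
            pattern.append(("lit", text.lower(), tag))
    return pattern


def match_pattern(pattern, chunks):
    """Return {role: text} for the first match in chunks, else None."""
    n = len(pattern)
    for start in range(len(chunks) - n + 1):
        found = {}
        window = chunks[start:start + n]
        for (kind, value, tag), (text, ctag) in zip(pattern, window):
            if ctag != tag:
                break  # syntactic category differs
            if kind == "lit" and text.lower() != value:
                break  # connecting words differ
            if kind == "slot":
                found[value] = text
        else:
            return found  # every pattern element matched
    return None


# Hypothetical training example: a user marks which chunks fill which roles.
training = [("Ann Lee", "person"), ("was named", "vg"),
            ("president", "title"), ("of", "prep"), ("Acme Corp", "company")]
pattern = learn_pattern(training, {0: "person", 2: "title", 4: "company"})

# Hypothetical unseen sentence with the same local syntax.
unseen = [("Yesterday", "adv"), ("Bob Ray", "person"), ("was named", "vg"),
          ("chairman", "title"), ("of", "prep"), ("Widget Inc", "company")]
result = match_pattern(pattern, unseen)
```

Here `result` maps each role to the text found in the unseen sentence. The sketch keeps literal connecting words exact; a real system would need some way to generalize them, which is outside what the abstract describes.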

Author information

Authors and Affiliations

Price Waterhouse Technology Centre, 68 Willow Road, Menlo Park, CA 94025, USA
Scott B. Huffman

Editor information

Stefan Wermter, Ellen Riloff, Gabriele Scheler

About this paper

Cite this paper

Huffman, S.B. (1996). Learning information extraction patterns from examples. In: Wermter, S., Riloff, E., Scheler, G. (eds) Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing. IJCAI 1995. Lecture Notes in Computer Science, vol 1040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60925-3_51

DOI: https://doi.org/10.1007/3-540-60925-3_51
Published: 07 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60925-4
Online ISBN: 978-3-540-49738-7
eBook Packages: Springer Book Archive
