Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Information Extraction

Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_204

Definition

Information Extraction (IE) is the task of extracting pre-specified types of facts from written texts or speech transcripts and converting them into structured representations (e.g., databases).

The following example illustrates IE terminology.
  • Input sentence:
    Media tycoon Barry Diller on Wednesday quit as chief of Vivendi Universal Entertainment, the entertainment unit of French giant Vivendi Universal whose future appears up for grabs.
  • IE output:
    • Entities:
      • Person Entity: {Media tycoon, Barry Diller}
      • Organization Entity: {Vivendi Universal Entertainment, the entertainment unit}
      • Organization Entity: {French giant, Vivendi Universal}
    • “Part-Whole” relation:
      • {Vivendi Universal Entertainment, the entertainment unit} is part of {French giant, Vivendi Universal}.
    • “End-Position” event.
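
As a rough illustration of what "structured representation" can mean here, the example output might be held in a few small record types and then flattened into table-like rows. This is only a sketch: the class names, field names, and the `to_rows` helper are hypothetical and not part of any standard IE schema, and the event's trigger and role fillers are an assumed reading of the input sentence rather than values given in this entry.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """One entity: a type plus the mentions that refer to it."""
    entity_type: str
    mentions: tuple


@dataclass(frozen=True)
class Relation:
    """A typed link stating that one entity is part of another."""
    relation_type: str
    part: Entity
    whole: Entity


@dataclass(frozen=True)
class Event:
    """An event mention: type, trigger word, and role -> filler arguments."""
    event_type: str
    trigger: str
    arguments: dict = field(default_factory=dict)


# Entities exactly as listed in the example output.
person = Entity("Person", ("Media tycoon", "Barry Diller"))
unit = Entity(
    "Organization",
    ("Vivendi Universal Entertainment", "the entertainment unit"),
)
parent = Entity("Organization", ("French giant", "Vivendi Universal"))

# The "Part-Whole" relation between the two organization entities.
part_whole = Relation("Part-Whole", part=unit, whole=parent)

# Assumption: the trigger "quit" and the role fillers below are read off the
# input sentence for illustration; the role names are hypothetical.
end_position = Event(
    "End-Position",
    trigger="quit",
    arguments={"Person": person, "Position": "chief"},
)


def to_rows(entities):
    """Flatten entities into (entity_id, entity_type, mention) rows."""
    return [
        (entity_id, entity.entity_type, mention)
        for entity_id, entity in enumerate(entities, start=1)
        for mention in entity.mentions
    ]


rows = to_rows([person, unit, parent])
for row in rows:
    print(row)
```

Each row pairs one mention with the entity it belongs to, so coreferent mentions such as "Media tycoon" and "Barry Diller" share an entity id. Rows of this shape could be loaded into a database table, which is one way to realize the conversion described in the definition.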

The above sentence includes a “Personnel_End-Position” event mention, with the trigger word which most clearly expresses the event occurrence, the position, the person who quit the position,...


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. New York University, New York, USA

Section editors and affiliations

  • Zheng Chen (1)
  1. Microsoft Research Asia, Microsoft Corporation, Beijing, China