Information Extraction



In its most basic form, text is a sequence of tokens that is not annotated with the properties of those tokens. The goal of information extraction is to discover specific types of useful properties of these tokens and the relationships among them.
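One common way to represent such token-level properties is the BIO tagging scheme used by sequence-labeling models for named entity recognition, in which each token is marked as beginning (B), inside (I), or outside (O) an entity mention. The sketch below is purely illustrative: the example sentence, tag sequence, and `extract_entities` helper are assumptions for exposition, not part of any particular system described in this chapter.

```python
def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) spans."""
    entities, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; flush any entity in progress.
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            # Continuation of the current entity mention.
            current.append(tok)
        else:
            # Outside any entity; flush the entity in progress, if any.
            if current:
                entities.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        entities.append((" ".join(current), etype))
    return entities

# Hypothetical sentence with per-token entity annotations.
tokens = ["Tim", "Cook", "is", "the", "CEO", "of", "Apple", "."]
tags   = ["B-PER", "I-PER", "O", "O", "O", "O", "B-ORG", "O"]
print(extract_entities(tokens, tags))  # [('Tim Cook', 'PER'), ('Apple', 'ORG')]
```

In practice the tag sequence itself would be predicted by a trained model (e.g., an HMM, maximum entropy Markov model, or conditional random field); this decoding step only converts the predicted tags back into entity spans.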


Keywords: Open Information Extraction · Maximum Entropy Markov Models · Parse Tree · Named Entity Recognition · Conditional Random Fields



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

1. IBM T. J. Watson Research Center, Yorktown Heights, USA
