Abstract
In its most basic form, text is a sequence of tokens that is not annotated with the properties of these tokens. The goal of information extraction is to discover specific types of useful properties of these tokens and their interrelationships.
Notes
- 1.
The bibliographic notes contain pointers to such methods.
- 2.
- 3.
- 4.
This early series of conferences played an important role in the evolution of the field of information extraction, which had been largely sporadic till then.
- 5.
Note that this approach is slightly different from directly tagging the last token of an entity with a state that indicates its end point. In the approach of the previous section, a person entity would be ended by a token with a PE tag, whereas Nymble simply uses a separate “end” token.
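The difference between the two encodings in this note can be made concrete with a small sketch. It is only an illustration under stated assumptions: the PE tag comes from the note itself, but the other tag names (PB, PC, PU, O, PERSON), the "<end>" marker string, the choice to label the marker with the entity's class, and both function names are hypothetical and are not taken from the chapter or from the Nymble system.

```python
def tag_with_end_state(tokens, entity_spans):
    """Encoding 1: the last token of an entity carries an end tag (PE).

    entity_spans holds (start, end) index pairs with an inclusive end.
    PB, PC, PU, and O are assumed tag names used only for illustration.
    """
    tags = ["O"] * len(tokens)
    for start, end in entity_spans:
        if start == end:
            tags[start] = "PU"          # single-token entity
        else:
            tags[start] = "PB"          # first token of the entity
            for i in range(start + 1, end):
                tags[i] = "PC"          # tokens strictly inside the entity
            tags[end] = "PE"            # last token marks the end point
    return list(zip(tokens, tags))


def tag_with_end_token(tokens, entity_spans):
    """Encoding 2: all entity tokens share one label, and a separate
    end marker is inserted right after the entity's last token.

    Labeling the marker with the entity class is an assumption of this sketch.
    """
    inside = {i for start, end in entity_spans for i in range(start, end + 1)}
    last_positions = {end for _, end in entity_spans}
    tagged = []
    for i, token in enumerate(tokens):
        tagged.append((token, "PERSON" if i in inside else "O"))
        if i in last_positions:
            tagged.append(("<end>", "PERSON"))  # hypothetical end marker
    return tagged


tokens = ["Bill", "Clinton", "lives", "here"]
spans = [(0, 1)]  # "Bill Clinton" is a person entity
print(tag_with_end_state(tokens, spans))
print(tag_with_end_token(tokens, spans))
```

In the first encoding the sequence length is unchanged and the end point is a property of a real token; in the second, the sequence grows by one marker per entity and the entity tokens themselves need no positional tags.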
Bibliography
C. Aggarwal. Data mining: The textbook. Springer, 2015.
E. Agichtein and L. Gravano. Snowball: Extracting relations from large plain-text collections. ACM Conference on Digital Libraries, pp. 85–94, 2000.
M. Banko and O. Etzioni. The tradeoffs between open and traditional relation extraction. ACL Conference, pp. 28–36, 2008.
O. Bender, F. Och, and H. Ney. Maximum entropy models for named entity recognition. Conference on Natural Language Learning at HLT-NAACL 2003, pp. 148–151, 2003.
D. Bikel, S. Miller, R. Schwartz, and R. Weischedel. Nymble: a high-performance learning name-finder. Applied Natural Language Processing Conference, pp. 194–201, 1997.
S. Brin. Extracting patterns and relations from the World Wide Web. International Workshop on the Web and Databases, 1998. http://link.springer.com/chapter/10.1007/10704656_11#page-1
R. Bunescu and R. Mooney. A shortest path dependency kernel for relation extraction. Human Language Technology and Empirical Methods in Natural Language Processing, pp. 724–731, 2005.
R. Bunescu and R. Mooney. Subsequence kernels for relation extraction. NIPS Conference, pp. 171–178, 2005.
M. Califf and R. Mooney. Bottom-up relational learning of pattern matching rules for information extraction. Journal of Machine Learning Research, 4, pp. 177–210, 2003.
Y. Chan and D. Roth. Exploiting syntactico-semantic structures for relation extraction. ACL Conference: Human Language Technologies, pp. 551–560, 2011.
F. Ciravegna. Adaptive information extraction from text by rule induction and generalisation. International Joint Conference on Artificial Intelligence, 17(1), pp. 1251–1256, 2001.
M. Collins and N. Duffy. Convolution kernels for natural language. NIPS Conference, pp. 625–632, 2001.
S. Cucerzan. Large-scale named entity disambiguation based on Wikipedia data. EMNLP-CoNLL, pp. 708–716, 2007.
A. Culotta and J. Sorensen. Dependency tree kernels for relation extraction. ACL Conference, 2004.
J. Curran and S. Clark. Language independent NER using a maximum entropy tagger. Conference on Natural Language Learning at HLT-NAACL 2003, pp. 164–167, 2003.
G. DeJong. Prediction and substantiation: A new approach to natural language processing. Cognitive Science, 3(3), pp. 251–273, 1979.
T. Dietterich. Machine learning for sequential data: A review. Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), pp. 15–30, 2002.
O. Etzioni, M. Cafarella, D. Downey, A. Popescu, T. Shaked, S. Soderland, D. Weld, and A. Yates. Unsupervised named-entity extraction from the web: An experimental study. Artificial Intelligence, 165(1), pp. 91–134, 2005.
A. Fader, S. Soderland, and O. Etzioni. Identifying relations for open information extraction. Conference on Empirical Methods in Natural Language Processing, pp. 1535–1545, 2011.
D. Freitag and A. McCallum. Information extraction with HMMs and shrinkage. AAAI-99 Workshop on Machine Learning for Information Extraction, pp. 31–36, 1999.
R. Grishman and B. Sundheim. Message Understanding Conference-6: A Brief History. COLING, pp. 466–471, 1996.
J. Jiang. Information extraction from text. Mining Text Data, Springer, pp. 11–41, 2012.
J. Jiang and C. Zhai. A systematic exploration of the feature space for relation extraction. HLT-NAACL, pp. 113–120, 2007.
N. Kambhatla. Combining lexical, syntactic and semantic features with maximum entropy models for information extraction. ACL Conference, pp. 178–181, 2004.
A. Krogh, M. Brown, I. Mian, K. Sjolander, and D. Haussler. Hidden Markov models in computational biology: Applications to protein modeling. Journal of Molecular Biology, 235(5), pp. 1501–1531, 1994.
J. Kupiec. Robust part-of-speech tagging using a hidden Markov model. Computer Speech and Language, 6(3), pp. 225–242, 1992.
J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. ICML Conference, pp. 282–289, 2001.
B. Liu. Web data mining: exploring hyperlinks, contents, and usage data. Springer, New York, 2007.
R. Malouf. A comparison of algorithms for maximum entropy parameter estimation. Conference on Natural Language Learning, pp. 1–7, 2002.
C. Manning and H. Schütze. Foundations of statistical natural language processing. MIT Press, 1999.
A. McCallum, D. Freitag, and F. Pereira. Maximum entropy Markov models for information extraction and segmentation. ICML Conference, pp. 591–598, 2000.
M. Mintz, S. Bills, R. Snow, and D. Jurafsky. Distant supervision for relation extraction without labeled data. Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing, pp. 1003–1011, 2009.
T. Nguyen and A. Moschitti. End-to-end relation extraction using distant supervision from external semantic repositories. ACL Conference, pp. 277–282, 2011.
L. Qian, G. Zhou, F. Kong, Q. Zhu, and P. Qian. Exploiting constituent dependencies for tree kernel-based semantic relation extraction. International Conference on Computational Linguistics, pp. 697–704, 2008.
L. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), pp. 257–286, 1989.
A. Ratnaparkhi. A maximum entropy model for part-of-speech tagging. Conference on Empirical Methods in Natural Language Processing, pp. 133–142, 1996.
X. Ren, M. Jiang, J. Shang, and J. Han. Constructing Structured Information Networks from Massive Text Corpora (Tutorial). WWW Conference, 2017.
B. Rosenfeld and R. Feldman. Clustering for unsupervised relation identification. ACM CIKM Conference, pp. 411–418, 2007.
S. Sarawagi. Information extraction. Foundations and Trends in Databases, 1(3), pp. 261–377, 2008.
S. Sarawagi and W. Cohen. Semi-markov conditional random fields for information extraction. NIPS Conference, pp. 1185–1192, 2004.
K. Seymore, A. McCallum, and R. Rosenfeld. Learning hidden Markov model structure for information extraction. AAAI-99 Workshop on Machine Learning for Information Extraction, pp. 37–42, 1999.
Y. Shinyama and S. Sekine. Preemptive information extraction using unrestricted relation discovery. Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, pp. 304–311, 2006.
S. Soderland. Learning information extraction rules for semi-structured and free text. Machine Learning, 34(1–3), pp. 233–272, 1999.
C. Sutton and A. McCallum. An introduction to conditional random fields. arXiv preprint, arXiv:1011.4088, 2010. https://arxiv.org/abs/1011.4088
K. Takeuchi and N. Collier. Use of support vector machines in extended named entity recognition. Conference on Natural Language Learning, pp. 1–7, 2002.
D. Zelenko, C. Aone, and A. Richardella. Kernel methods for relation extraction. Journal of Machine Learning Research, 3, pp. 1083–1106, 2003.
M. Zhang, J. Zhang, and J. Su. Exploring syntactic features for relation extraction using a convolution tree kernel. Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pp. 288–295, 2006.
M. Zhang, J. Zhang, J. Su, and G. Zhou. A composite kernel to extract relations between entities with both flat and structured features. International Conference on Computational Linguistics and the Annual Meeting of the Association for Computational Linguistics, pp. 825–832, 2006.
S. Zhao and R. Grishman. Extracting relations with integrated information using kernel methods. ACL Conference, pp. 419–426, 2005.
https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Aggarwal, C.C. (2018). Information Extraction. In: Machine Learning for Text. Springer, Cham. https://doi.org/10.1007/978-3-319-73531-3_12
DOI: https://doi.org/10.1007/978-3-319-73531-3_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73530-6
Online ISBN: 978-3-319-73531-3
eBook Packages: Computer Science (R0)