Deep Distant Supervision: Learning Statistical Relational Models for Weak Supervision in Natural Language Extraction

Chapter in: Solving Large Scale Learning Tasks: Challenges and Algorithms

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9580)

Abstract

One of the challenges in information extraction is the need for human-annotated examples, commonly called gold-standard examples. Many successful approaches alleviate this problem by employing some form of distant supervision, i.e., consulting knowledge bases such as Freebase as a source of supervision to create more examples. While this is perfectly reasonable, most distant supervision methods rely on a given set of propositions as a source of supervision. We propose a different approach: we infer weakly supervised examples for relations from statistical relational models learned using knowledge outside the natural language task. We argue that this deep distant supervision creates more robust examples that are particularly useful when learning the entire model (the structure and parameters). We demonstrate on several domains that this form of weak supervision yields superior results when learning structure, compared to using distant-supervision labels or a smaller set of labels.


Notes

  1. http://www.nist.gov/tac/2015/KBP/ColdStart/index.html

  2. The ratio is actually the log-odds of their weights; we refer the reader to the book [4] for more details.

  3. With probabilistic training examples, it can be shown that minimizing the KL-divergence between the examples and the current model yields \(\text{true probability} - \text{predicted probability}\) as the gradient. This has a similar effect, pushing the predicted probabilities closer to the true probabilities.
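The gradient in note 3 can be illustrated with a minimal sketch. It assumes a sigmoid link between the model's regression score and its predicted probability, in the style of functional-gradient boosting of relational models [12]; the function names and example values below are illustrative, not taken from the chapter.

```python
import math

def sigmoid(score):
    """Map a regression score to a predicted probability."""
    return 1.0 / (1.0 + math.exp(-score))

def kl_gradient(true_prob, score):
    """Pointwise gradient from minimizing the KL-divergence between a
    probabilistic (weakly labeled) example and the current model:
    true probability minus predicted probability."""
    return true_prob - sigmoid(score)

# A soft label of 0.9 against a model score of 0.0 (predicted
# probability 0.5) yields a positive gradient, nudging the prediction
# toward the weak label; a prediction that already matches the soft
# label yields a zero gradient.
g = kl_gradient(0.9, 0.0)
```

Note how the gradient vanishes exactly when the predicted probability equals the soft label, which is what "pushing the predicted probabilities closer to the true probabilities" amounts to.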

References

  1. Bell, B., Koren, Y., Volinsky, C.: The BellKor solution to the Netflix Grand Prize (2009)

  2. Craven, M., Kumlien, J.: Constructing biological knowledge bases by extracting information from text sources. In: ISMB (1999)

  3. Devlin, S., Kudenko, D., Grzes, M.: An empirical study of potential-based reward shaping and advice in complex, multi-agent systems. Adv. Complex Syst. 14(2), 251–278 (2011)

  4. Domingos, P., Lowd, D.: Markov Logic: An Interface Layer for AI. Morgan & Claypool, San Rafael (2009)

  5. Hoffmann, R., Zhang, C., Ling, X., Zettlemoyer, L., Weld, D.S.: Knowledge-based weak supervision for information extraction of overlapping relations. In: ACL (2011)

  6. Kersting, K., Driessens, K.: Non-parametric policy gradients: a unified treatment of propositional and relational domains. In: ICML (2008)

  7. Khot, T., Natarajan, S., Kersting, K., Shavlik, J.: Learning Markov logic networks via functional gradient boosting. In: ICDM (2011)

  8. Kim, J., Ohta, T., Pyysalo, S., Kano, Y., Tsujii, J.: Overview of BioNLP’09 shared task on event extraction. In: BioNLP Workshop Companion Volume for Shared Task (2009)

  9. Kuhlmann, G., Stone, P., Mooney, R.J., Shavlik, J.W.: Guiding a reinforcement learner with natural language advice: initial results in RoboCup soccer. In: AAAI Workshop on Supervisory Control of Learning and Adaptive Systems (2004)

  10. Mintz, M., Bills, S., Snow, R., Jurafsky, D.: Distant supervision for relation extraction without labeled data. In: ACL and AFNLP (2009)

  11. Natarajan, S., Kersting, K., Khot, T., Shavlik, J.: Boosted Statistical Relational Learners: From Benchmarks to Data-Driven Medicine. SpringerBriefs in Computer Science. Springer, Heidelberg (2015)

  12. Natarajan, S., Khot, T., Kersting, K., Guttmann, B., Shavlik, J.: Gradient-based boosting for statistical relational learning: the relational dependency network case. Mach. Learn. 86(1), 25–56 (2012)

  13. Natarajan, S., Picado, J., Khot, T., Kersting, K., Re, C., Shavlik, J.: Effectively creating weakly labeled training examples via approximate domain knowledge. In: Davis, J., Ramon, J. (eds.) ILP 2014. LNCS, vol. 9046, pp. 92–107. Springer, Heidelberg (2015). doi:10.1007/978-3-319-23708-4_7

  14. Neville, J., Jensen, D.: Relational dependency networks. In: Getoor, L., Taskar, B. (eds.) Introduction to Statistical Relational Learning, pp. 653–692. MIT Press, Cambridge (2007)

  15. Niu, F., Ré, C., Doan, A., Shavlik, J.W.: Tuffy: scaling up statistical inference in Markov logic networks using an RDBMS. PVLDB 4(6), 373–384 (2011)

  16. Poon, H., Vanderwende, L.: Joint inference for knowledge extraction from biomedical literature. In: NAACL (2010)

  17. Raghavan, S., Mooney, R.: Online inference-rule learning from natural-language extractions. In: International Workshop on Statistical Relational AI (2013)

  18. Riedel, S., Chun, H., Takagi, T., Tsujii, J.: A Markov logic approach to bio-molecular event extraction. In: BioNLP (2009)

  19. Riedel, S., Yao, L., McCallum, A.: Modeling relations and their mentions without labeled text. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010, Part III. LNCS, vol. 6323, pp. 148–163. Springer, Heidelberg (2010)

  20. Sorower, S., Dietterich, T., Doppa, J., Orr, W., Tadepalli, P., Fern, X.: Inverting Grice’s maxims to learn rules from natural language extractions. In: NIPS, pp. 1053–1061 (2011)

  21. Surdeanu, M., Ciaramita, M.: Robust information extraction with perceptrons. In: NIST ACE (2007)

  22. Surdeanu, M., Tibshirani, J., Nallapati, R., Manning, C.: Multi-instance multi-label learning for relation extraction. In: EMNLP-CoNLL (2012)

  23. Takamatsu, S., Sato, I., Nakagawa, H.: Reducing wrong labels in distant supervision for relation extraction. In: ACL (2012)

  24. Torrey, L., Shavlik, J., Walker, T., Maclin, R.: Transfer learning via advice taking. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds.) Advances in Machine Learning I. SCI, vol. 262, pp. 147–170. Springer, Heidelberg (2010)

  25. Verhagen, M., Gaizauskas, R., Schilder, F., Hepple, M., Katz, G., Pustejovsky, J.: SemEval-2007 task 15: TempEval temporal relation identification. In: SemEval (2007)

  26. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: CVPR (2001)

  27. Yoshikawa, K., Riedel, S., Asahara, M., Matsumoto, Y.: Jointly identifying temporal relations with Markov logic. In: ACL and AFNLP (2009)

  28. Zhou, G., Su, J., Zhang, J., Zhang, M.: Exploring various knowledge in relation extraction. In: ACL (2005)


Acknowledgements

Sriraam Natarajan, Anurag Wazalwar and Dileep Viswanathan gratefully acknowledge the support of the DARPA Machine Reading Program and DEFT Program under the Air Force Research Laboratory (AFRL) prime contract nos. FA8750-09-C-0181 and FA8750-13-2-0039, respectively. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA, AFRL, or the US government. Kristian Kersting was supported by the Fraunhofer ATTRACT fellowship STREAM and by the European Commission under contract number FP7-248258-First-MM.

Author information


Corresponding author

Correspondence to Sriraam Natarajan.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Natarajan, S., Soni, A., Wazalwar, A., Viswanathan, D., Kersting, K. (2016). Deep Distant Supervision: Learning Statistical Relational Models for Weak Supervision in Natural Language Extraction. In: Michaelis, S., Piatkowski, N., Stolpe, M. (eds) Solving Large Scale Learning Tasks: Challenges and Algorithms. Lecture Notes in Computer Science, vol. 9580. Springer, Cham. https://doi.org/10.1007/978-3-319-41706-6_18

  • DOI: https://doi.org/10.1007/978-3-319-41706-6_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41705-9

  • Online ISBN: 978-3-319-41706-6

  • eBook Packages: Computer Science (R0)
