Evaluating Automatic Learning of Structure for Event Extraction

Schlachter, Jason; Van Brackle, David; Reynoso, Luis Asencios; Starz, James; Chambers, Nathanael

doi:10.1007/978-3-319-41636-6_12

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 480)

Abstract

Analysts engaged in monitoring and forecasting benefit from structured representations of domain knowledge and societal events, which allow advanced analytics and predictive data models to be applied to large amounts of temporally extended data. However, extracting structured data from unstructured data typically requires domain-specific software that is costly, takes months to years to create, and cannot adapt to changing domains. In this paper we consider the operational usefulness of an approach pioneered by Chambers and Jurafsky (Template-based information extraction without the templates, 2011, [1]) that automatically learns structured domain knowledge, in the form of event templates, from unstructured text; the templates are then used to automatically extract structured events from text. We generalize this approach and apply it to operationally relevant corpora from Brazil, Mexico, Ukraine, and Pakistan that focus on societal protests and the provision of aid. We find that we can generate compelling event templates that correspond to event types described by Conflict and Mediation Event Observations (CAMEO) codes (Retrieved from Computational Event Data System, 2014, [2]), which existing state-of-the-art systems use to label event types. Additionally, we are able to learn event templates that capture more nuance than the CAMEO codes represent, as well as entirely new and interesting event types. To automate our experimentation, we describe novel automated metrics that let us run multiple experiments in batch while receiving automated feedback on the quality of the results from each run. These metrics indicate significant overlap between the events we extract and those extracted by existing systems.
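
The abstract does not say how overlap between the two sets of extracted events is computed. As a rough illustration only, and not the chapter's metric, the sketch below assumes each event can be reduced to a hypothetical (date, country, event type) record and compares two systems by exact set intersection. The `Event` fields, the `overlap` function, and the toy records are all assumptions made up for this example.

```python
from typing import NamedTuple


class Event(NamedTuple):
    # Hypothetical minimal event record; these fields are assumptions,
    # not the representation used in the chapter.
    date: str        # ISO date of the reported event
    country: str
    event_type: str  # e.g. a CAMEO-style label such as "protest"


def overlap(ours: set[Event], baseline: set[Event]) -> dict[str, float]:
    """Share of exactly matching events, seen from each system's side."""
    shared = ours & baseline
    union = ours | baseline
    return {
        "matched_of_ours": len(shared) / len(ours) if ours else 0.0,
        "matched_of_baseline": len(shared) / len(baseline) if baseline else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy records invented for illustration; they are not data from the chapter.
ours = {
    Event("2014-06-12", "Brazil", "protest"),
    Event("2014-06-13", "Mexico", "provide_aid"),
    Event("2014-06-14", "Ukraine", "protest"),
}
baseline = {
    Event("2014-06-12", "Brazil", "protest"),
    Event("2014-06-14", "Ukraine", "protest"),
    Event("2014-06-15", "Pakistan", "protest"),
}

scores = overlap(ours, baseline)
print(scores)
```

Exact matching is the simplest possible choice. A practical comparison would likely need fuzzier matching, for example tolerating a one-day date difference or mapping learned templates to CAMEO codes before comparing.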

References

[1] Chambers, N., Jurafsky, D.: Template-based information extraction without the templates. In: Proceedings of the Association for Computational Linguistics (ACL) (2011)
[2] CAMEO Event Data Cookbook. Retrieved from Computational Event Data System. http://eventdata.parusanalytics.com/data.dir/cameo.html. Accessed 21 June 2014
[3] Rau, L., Krupka, G., Jacobs, P., Sider, I., Childs, L.: GE NLToolset: MUC-4 test results and analysis. In: Proceedings of the Message Understanding Conference (MUC-4), pp. 94–99 (1992)
[4] Chinchor, N., Lewis, D., Hirschman, L.: Evaluating message understanding systems: an analysis of the third message understanding conference. Comput. Linguist. 19(3), 409–449 (1993)
[5] Freitag, D.: Toward general-purpose learning for information extraction. In: Proceedings of the Association for Computational Linguistics (ACL), pp. 404–408 (1998)
[6] Chieu, H.L., Ng, H.T., Lee, Y.K.: Closing the gap: learning-based information extraction rivaling knowledge-engineering methods. In: Proceedings of the Association for Computational Linguistics (ACL) (2003)
[7] Bunescu, R., Mooney, R.: Collective information extraction with relational Markov networks. In: Proceedings of the Association for Computational Linguistics (ACL), pp. 438–445 (2004)
[8] Patwardhan, S., Riloff, E.: A unified model of phrasal and sentential evidence for information extraction. In: Proceedings of the Conference on Empirical Methods on Natural Language Processing (EMNLP) (2009)
[9] Huang, R., Riloff, E.: Peeling back the layers: detecting event role fillers in secondary contexts. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL) (2011)
[10] Riloff, E., Schmelzenbach, M.: An empirical approach to conceptual case frame acquisition. In: Proceedings of the Sixth Workshop on Very Large Corpora (1998)
[11] Sudo, K., Sekine, S., Grishman, R.: An improved extraction pattern representation model for automatic IE pattern acquisition. In: Proceedings of the Association for Computational Linguistics (ACL), pp. 224–231 (2003)
[12] Riloff, E., Wiebe, J., Phillips, W.: Exploiting subjectivity classification to improve information extraction. In: Proceedings of AAAI-05 (2005)
[13] Filatova, E., Hatzivassiloglou, V., McKeown, K.: Automatic creation of domain templates. In: Proceedings of the Association for Computational Linguistics (ACL) (2006)
[14] Patwardhan, S., Riloff, E.: Effective IE with semantic affinity patterns and relevant regions. In: Proceedings of the Conference on Empirical Methods on Natural Language Processing (EMNLP) (2007)
[15] Chen, H., Benson, E., Naseem, T., Barzilay, R.: In-domain relation discovery with meta-constraints via posterior regularization. In: Proceedings of the Association for Computational Linguistics (ACL) (2011)
[16] Chambers, N., Jurafsky, D.: Unsupervised learning of narrative event chains. In: Proceedings of the Association for Computational Linguistics (ACL), Hawaii, USA (2008)
[17] Chambers, N., Jurafsky, D.: Unsupervised learning of narrative schemas and their participants. In: Proceedings of the Association for Computational Linguistics (ACL), Columbus, Ohio (2009)
[18] Cheung, J.C.K., Poon, H., Vanderwende, L.: Probabilistic frame induction. In: Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2013)
[19] Chambers, N.: Event Schema Induction with a Probabilistic Entity-Driven Model. EMNLP-2013, Seattle, WA (2013)
[20] Nguyen, K.-H., Tannier, X., Ferret, O., Besançon, R.: Generative Event Schema Induction with Entity Disambiguation. ACL (2015)
[21] Balasubramanian, N., Soderland, S., Mausam, Etzioni, O.: Generating Coherent Event Schemas at Scale. EMNLP (2013)
[22] Jans, B., Vulic, I., Moens, M.F.: Skip N-grams and Ranking Functions for Predicting Script Events. EACL (2012)
[23] Pichotta, K., Mooney, R.J.: Statistical Script Learning with Multi-Argument Events. EACL (2014)
[24] Rudinger, R., Rastogi, P., Ferraro, F., Van Durme, B.: Script Induction as Language Modeling. EMNLP (2015)
[25] Erk, K., Pado, S.: Exemplar-based models for word meaning in context. In: Proceedings of ACL, pp. 92–97 (2010)

Author information

Authors and Affiliations

Informatics Laboratory, Lockheed Martin Advanced Technology Laboratories, Kennesaw, GA, USA
Jason Schlachter, David Van Brackle, Luis Asencios Reynoso & James Starz
Department of Computer Science, United States Naval Academy, Annapolis, MD, USA
Nathanael Chambers

Corresponding author

Correspondence to Jason Schlachter.

Editor information

Editors and Affiliations

Advanced Distributed Learning (ADL) Initiative, Arlington, VA, USA
Sae Schatz
Lockheed Martin's Advanced Technology Labs, Orlando, FL, USA
Mark Hoffman

About this paper

Cite this paper

Schlachter, J., Van Brackle, D., Reynoso, L.A., Starz, J., Chambers, N. (2017). Evaluating Automatic Learning of Structure for Event Extraction. In: Schatz, S., Hoffman, M. (eds) Advances in Cross-Cultural Decision Making. Advances in Intelligent Systems and Computing, vol 480. Springer, Cham. https://doi.org/10.1007/978-3-319-41636-6_12

DOI: https://doi.org/10.1007/978-3-319-41636-6_12
Published: 02 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41635-9
Online ISBN: 978-3-319-41636-6
eBook Packages: Engineering, Engineering (R0)
