Abstract
Analysts engaged in monitoring and forecasting benefit from structured representations of domain knowledge and societal events, which allow advanced analytics and predictive data models to be applied to large amounts of temporally extended data. However, extracting structured data from unstructured data typically requires domain-specific software that is costly, takes months to years to develop, and cannot adapt to changing domains. In this paper we consider the operational usefulness of an approach pioneered by Chambers and Jurafsky (Template-based information extraction without the templates, 2011, [1]) that automatically learns structured domain knowledge, in the form of event templates, from unstructured text; the learned templates are then used to automatically extract structured events from text. We generalize this approach and apply it to operationally relevant corpora from Brazil, Mexico, Ukraine, and Pakistan that focus on societal protests and the provision of aid. We find that we can generate compelling event templates that correspond to event types described by Conflict and Mediation Event Observations (CAMEO) codes (Retrieved from Computational Event Data System, 2014, [2]), which existing state-of-the-art systems use to label event types. Additionally, we are able to learn event templates that capture more nuance than the CAMEO codes represent, as well as entirely new and interesting event types. To automate our experimentation, we describe novel automated metrics that allow us to run multiple experiments in batch while receiving automated feedback on the quality of the results from each run. These metrics indicate significant overlap between the events we extract and those extracted by existing systems.
References
Chambers, N., Jurafsky, D.: Template-based information extraction without the templates. In: Proceedings of the Association for Computational Linguistics (ACL) (2011)
CAMEO Event Data Cookbook. Retrieved from Computational Event Data System. http://eventdata.parusanalytics.com/data.dir/cameo.html. Accessed 21 June 2014
Rau, L., Krupka, G., Jacobs, P., Sider, I., Childs, L.: GE NLToolset: MUC-4 test results and analysis. In: Proceedings of the Message Understanding Conference (MUC-4), pp. 94–99 (1992)
Chinchor, N., Lewis, D., Hirschman, L.: Evaluating message understanding systems: an analysis of the third message understanding conference. Comput. Linguist. 19(3), 409–449 (1993)
Freitag, D.: Toward general-purpose learning for information extraction. In: Proceedings of the Association for Computational Linguistics (ACL), pp. 404–408 (1998)
Chieu, H.L., Ng, H.T., Lee, Y.K.: Closing the gap: learning-based information extraction rivaling knowledge-engineering methods. In: Proceedings of the Association for Computational Linguistics (ACL) (2003)
Bunescu, R., Mooney, R.: Collective information extraction with relational Markov networks. In: Proceedings of the Association for Computational Linguistics (ACL), pp. 438–445 (2004)
Patwardhan, S., Riloff, E.: A unified model of phrasal and sentential evidence for information extraction. In: Proceedings of the Conference on Empirical Methods on Natural Language Processing (EMNLP) (2009)
Huang, R., Riloff, E.: Peeling back the layers: detecting event role fillers in secondary contexts. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL) (2011)
Riloff, E., Schmelzenbach, M.: An empirical approach to conceptual case frame acquisition. In: Proceedings of the Sixth Workshop on Very Large Corpora (1998)
Sudo, K., Sekine, S., Grishman, R.: An improved extraction pattern representation model for automatic IE pattern acquisition. In: Proceedings of the Association for Computational Linguistics (ACL), pp. 224–231 (2003)
Riloff, E., Wiebe, J., Phillips, W.: Exploiting subjectivity classification to improve information extraction. In: Proceedings of AAAI-05 (2005)
Filatova, E., Hatzivassiloglou, V., McKeown, K.: Automatic creation of domain templates. In: Proceedings of the Association for Computational Linguistics (ACL) (2006)
Patwardhan, S., Riloff, E.: Effective IE with semantic affinity patterns and relevant regions. In: Proceedings of the Conference on Empirical Methods on Natural Language Processing (EMNLP) (2007)
Chen, H., Benson, E., Naseem, T., Barzilay, R.: In-domain relation discovery with meta-constraints via posterior regularization. In: Proceedings of the Association for Computational Linguistics (ACL) (2011)
Chambers, N., Jurafsky, D.: Unsupervised learning of narrative event chains. In: Proceedings of the Association for Computational Linguistics (ACL), Hawaii, USA (2008)
Chambers, N., Jurafsky, D.: Unsupervised learning of narrative schemas and their participants. In: Proceedings of the Association for Computational Linguistics (ACL), Columbus, Ohio (2009)
Cheung, J.C.K., Poon, H., Vanderwende, L.: Probabilistic frame induction. In: Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2013)
Chambers, N.: Event Schema Induction with a Probabilistic Entity-Driven Model. EMNLP-2013, Seattle, WA (2013)
Nguyen, K.-H., Tannier, X., Ferret, O., Besançon, R.: Generative Event Schema Induction with Entity Disambiguation. ACL (2015)
Balasubramanian, N., Soderland, S., Mausam, Etzioni, O.: Generating Coherent Event Schemas at Scale. EMNLP (2013)
Jans, B., Vulic, I., Moens, M.F.: Skip N-grams and Ranking Functions for Predicting Script Events. EACL (2012)
Pichotta, K., Mooney, R.J.: Statistical Script Learning with Multi-Argument Events. EACL (2014)
Rudinger, R., Rastogi, P., Ferraro, F., Van Durme, B.: Script Induction as Language Modeling. EMNLP (2015)
Erk, K., Pado, S.: Exemplar-based models for word meaning in context. In: Proceedings of ACL, pp. 92–97 (2010)
Copyright information
© 2017 Springer International Publishing Switzerland
About this paper
Cite this paper
Schlachter, J., Van Brackle, D., Reynoso, L.A., Starz, J., Chambers, N. (2017). Evaluating Automatic Learning of Structure for Event Extraction. In: Schatz, S., Hoffman, M. (eds) Advances in Cross-Cultural Decision Making. Advances in Intelligent Systems and Computing, vol 480. Springer, Cham. https://doi.org/10.1007/978-3-319-41636-6_12
DOI: https://doi.org/10.1007/978-3-319-41636-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41635-9
Online ISBN: 978-3-319-41636-6
eBook Packages: Engineering, Engineering (R0)