
Event Extraction with Deep Contextualized Word Representation and Multi-attention Layer

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11323)

Abstract

Event extraction is a common application of text mining. An event extraction task aims to identify event triggers of a given event type in text and to find their related arguments. In recent years, techniques for automatically extracting events from text have drawn increasing attention from researchers. However, existing approaches, including feature-based systems and neural-network-based models, do not capture contextual information well. Moreover, it remains difficult to extract the deep semantic relations needed to find related arguments for events. To address these issues, we propose a novel model for event extraction that uses multi-attention layers and deep contextualized word representations. Furthermore, we put forward an attention function suited to event extraction tasks. Experimental results show that our model outperforms state-of-the-art models on the ACE2005 dataset.
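The abstract does not reproduce the paper's specific attention function, so as a purely illustrative sketch, the general idea of attending over contextualized token representations can be shown with a generic Bahdanau-style additive attention. All names here (`additive_attention`, `W_q`, `W_k`, `v`) and the use of random vectors in place of real contextualized embeddings are assumptions for illustration, not the authors' actual formulation:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def additive_attention(query, keys, W_q, W_k, v):
    # score_i = v^T tanh(W_q q + W_k k_i)  (additive / Bahdanau-style scoring)
    scores = np.array([v @ np.tanh(W_q @ query + W_k @ k) for k in keys])
    weights = softmax(scores)          # attention distribution over tokens
    context = weights @ keys           # weighted sum of token representations
    return context, weights

# toy setup: 4 token vectors of dimension 8, attention dimension 6
rng = np.random.default_rng(0)
d, a, n = 8, 6, 4
tokens = rng.normal(size=(n, d))       # stand-in for contextualized embeddings
W_q = rng.normal(size=(a, d))
W_k = rng.normal(size=(a, d))
v = rng.normal(size=a)

# attend from the first token (e.g. a candidate trigger) over the sentence
ctx, w = additive_attention(tokens[0], tokens, W_q, W_k, v)
```

In an event extraction setting, a layer like this would let a candidate trigger word weigh every other token in the sentence when deciding which spans serve as its arguments; stacking several such layers (multi-attention) lets later layers refine the focus of earlier ones.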



Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. U1636211, 61672081, 61370126), the Beijing Advanced Innovation Center for Imaging Technology (No. BAICIT-2016001), and the National Key R&D Program of China under Grant 2016QY04W0802.

Author information


Correspondence to Ruixue Ding or Zhoujun Li.



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Ding, R., Li, Z. (2018). Event Extraction with Deep Contextualized Word Representation and Multi-attention Layer. In: Gan, G., Li, B., Li, X., Wang, S. (eds.) Advanced Data Mining and Applications. ADMA 2018. Lecture Notes in Computer Science, vol. 11323. Springer, Cham. https://doi.org/10.1007/978-3-030-05090-0_17


  • DOI: https://doi.org/10.1007/978-3-030-05090-0_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05089-4

  • Online ISBN: 978-3-030-05090-0

  • eBook Packages: Computer Science (R0)
