Event Extraction with Deep Contextualized Word Representation and Multi-attention Layer

  • Ruixue Ding
  • Zhoujun Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11323)

Abstract

Event extraction is a common text-mining task: given a text, the goal is to identify event triggers of a certain event type and to find their related arguments. In recent years, the automatic extraction of events from text has drawn increasing attention from researchers. However, existing approaches, both feature-based systems and neural-network-based models, do not capture contextual information well. Moreover, it remains difficult to extract the deep semantic relations needed to find the related arguments of an event. To address these issues, we propose a novel model for event extraction that combines multi-attention layers with deep contextualized word representations. We also put forward an attention function tailored to event extraction. Experimental results show that our model outperforms state-of-the-art models on ACE2005.
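The paper does not show its attention function, but the mechanism it builds on, attending over contextualized token vectors with respect to a candidate trigger, can be illustrated with a minimal additive-attention sketch. Everything here (the shapes, the parameter names `W`, `U`, `v`, and the random inputs standing in for ELMo-style embeddings) is a hypothetical illustration, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def additive_attention(H, q, W, U, v):
    """Score each token vector in H against a query q (e.g. a candidate
    trigger representation) and return attention weights plus the
    weighted context vector."""
    scores = np.tanh(H @ W + q @ U) @ v   # shape (T,)
    weights = softmax(scores)             # sums to 1 over tokens
    context = weights @ H                 # shape (d,)
    return weights, context

rng = np.random.default_rng(0)
T, d = 5, 8
H = rng.normal(size=(T, d))   # stand-in for contextualized token vectors
q = rng.normal(size=(d,))     # stand-in for a candidate trigger vector
W = rng.normal(size=(d, d))
U = rng.normal(size=(d, d))
v = rng.normal(size=(d,))

w, c = additive_attention(H, q, W, U, v)
assert np.isclose(w.sum(), 1.0)
assert c.shape == (d,)
```

A multi-attention layer in this spirit would stack several such scoring functions (with separate parameters) and combine their context vectors, letting different heads focus on different argument-bearing positions.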

Keywords

Event extraction · Multi-attention layer · Deep contextualized word representation

Notes

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. U1636211, 61672081, 61370126), the Beijing Advanced Innovation Center for Imaging Technology (No. BAICIT-2016001), and the National Key R&D Program of China under Grant 2016QY04W0802.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Computer Science and Engineering, Beihang University, Beijing, China