Abstract
Recurrent neural networks and their variants have proven more effective than conventional feature-based methods for relation extraction. In this paper, we propose a model that combines a bidirectional long short-term memory (BiLSTM) network with a multi-attention mechanism for relation extraction. We design a word-level attention mechanism to weight the informative words within a single sentence, and a sentence-level attention mechanism to weight the informative sentences within a sentence set. Experiments on a public dataset show that the multi-attention mechanism exploits the informative features at both levels, and that our model achieves state-of-the-art performance.
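The two attention layers named in the abstract can be illustrated with a minimal numpy sketch. This is not the authors' exact formulation; it assumes a standard attention recipe (tanh scoring against a learned vector for word-level attention, dot-product scoring against a learned relation query for sentence-level attention), and the parameter vectors `w` and `q` here stand in for weights that would be learned during training.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def word_attention(H, w):
    """Pool BiLSTM hidden states of one sentence into a sentence vector.

    H: (T, d) array of hidden states, one row per token.
    w: (d,)  hypothetical learned scoring vector.
    Returns the attention-weighted sentence vector and the weights.
    """
    scores = np.tanh(H) @ w      # (T,) one score per token
    alpha = softmax(scores)      # (T,) weights summing to 1
    return alpha @ H, alpha      # (d,) weighted sum of hidden states

def sentence_attention(S, q):
    """Pool a bag of sentence vectors (same entity pair) into one vector.

    S: (n, d) array of sentence vectors from word_attention.
    q: (d,)  hypothetical learned relation query vector.
    """
    beta = softmax(S @ q)        # (n,) one weight per sentence
    return beta @ S, beta        # (d,) bag representation
```

A sentence that clearly expresses the relation receives a large `beta`, so noisy sentences in a distantly supervised bag contribute little to the final representation.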
Acknowledgement
This research was supported by the National Natural Science Foundation of China (NSFC) under the project Nos. 61502517, 61672020 and 61662069.
© 2017 Springer International Publishing AG
Cite this paper
Li, L., Nie, Y., Han, W., Huang, J. (2017). A Multi-attention-Based Bidirectional Long Short-Term Memory Network for Relation Extraction. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science(), vol 10638. Springer, Cham. https://doi.org/10.1007/978-3-319-70139-4_22
Print ISBN: 978-3-319-70138-7
Online ISBN: 978-3-319-70139-4