Abstract
Semantic interaction between text segments, which has proven useful for detecting paraphrase relations, is often ignored in studies of paraphrase identification. In this paper, we adopt a neural network model for paraphrase identification, called the bidirectional Long Short-Term Memory with Gated Relevance Network (Bi-LSTM+GRN). In this model, a gated relevance network captures the semantic interactions between text segments, which are then aggregated by a pooling layer that selects the most informative interactions. Experiments on the Microsoft Research Paraphrase Corpus (MSRP) benchmark dataset show that this model achieves better performance than both hand-crafted feature based approaches and previous neural network models.
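The interaction step described above can be sketched in a few lines of NumPy. This is a minimal, illustrative reconstruction, not the paper's implementation: the parameter names (`M`, `W`, `Wg`, `u`) and initialization are assumptions. Each pair of hidden states (stand-ins for Bi-LSTM outputs) is scored by gating between a bilinear tensor term and a single-layer network term, yielding an interaction matrix over which pooling keeps the strongest entries.

```python
import numpy as np

def gated_relevance_scores(H1, H2, params):
    """Score every pair of hidden states from two sentences.

    H1: (m, d) and H2: (n, d) arrays of per-token hidden states.
    Returns an (m, n) interaction matrix. Hedged sketch; parameter
    names are illustrative, not taken from the paper.
    """
    M, W, Wg, u = params["M"], params["W"], params["Wg"], params["u"]
    m, n = len(H1), len(H2)
    S = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            h1, h2 = H1[i], H2[j]
            cat = np.concatenate([h1, h2])
            # bilinear tensor term: one score per tensor slice, (k,)
            bilinear = np.array([h1 @ M[s] @ h2 for s in range(M.shape[0])])
            # single-layer network term on the concatenated states, (k,)
            single = np.tanh(W @ cat)
            # sigmoid gate blends the two terms elementwise, (k,)
            gate = 1.0 / (1.0 + np.exp(-(Wg @ cat)))
            S[i, j] = u @ (gate * bilinear + (1.0 - gate) * single)
    return S

def init_params(d, k, rng):
    """Random small-scale parameters for a d-dim state, k tensor slices."""
    return {
        "M": rng.standard_normal((k, d, d)) * 0.1,
        "W": rng.standard_normal((k, 2 * d)) * 0.1,
        "Wg": rng.standard_normal((k, 2 * d)) * 0.1,
        "u": rng.standard_normal(k) * 0.1,
    }

rng = np.random.default_rng(0)
d, k = 8, 4
H1 = rng.standard_normal((5, d))   # stand-in for Bi-LSTM states, sentence 1
H2 = rng.standard_normal((7, d))   # stand-in for Bi-LSTM states, sentence 2
S = gated_relevance_scores(H1, H2, init_params(d, k, rng))
pooled = np.sort(S.ravel())[-3:]   # pooling: keep the strongest interactions
print(S.shape, pooled.shape)
```

In the full model the pooled interaction features would feed a classifier that outputs the paraphrase decision; here the top-3 pooling is only a placeholder for that aggregation step.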
© 2016 Springer International Publishing AG
Cite this paper
Shen, Y., Chen, J., Huang, X. (2016). Bidirectional Long Short-Term Memory with Gated Relevance Network for Paraphrase Identification. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds.) Natural Language Understanding and Intelligent Applications. ICCPOL/NLPCC 2016. Lecture Notes in Computer Science, vol. 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_4
Print ISBN: 978-3-319-50495-7
Online ISBN: 978-3-319-50496-4