
Bidirectional Long Short-Term Memory with Gated Relevance Network for Paraphrase Identification

  • Conference paper
  • In: Natural Language Understanding and Intelligent Applications (ICCPOL 2016, NLPCC 2016)

Abstract

Semantic interaction between text segments, which has proven very useful for detecting paraphrase relations, is often ignored in studies of paraphrase identification. In this paper, we adopt a neural network model for paraphrase identification called the Bidirectional Long Short-Term Memory with Gated Relevance Network (Bi-LSTM+GRN). In this model, a gated relevance network captures the semantic interactions between text segments, and a pooling layer then aggregates them, selecting the most informative interactions. Experiments on the Microsoft Research Paraphrase Corpus (MSRP) benchmark show that this model achieves better performance than hand-crafted-feature-based approaches as well as previous neural network models.
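The scoring step the abstract describes — every pair of Bi-LSTM hidden states compared by a gated relevance network, then pooled — can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: all dimensions, parameter names (`M`, `V`, `Wg`, `u`), and the random stand-ins for the Bi-LSTM outputs are hypothetical, and the gate here blends a bilinear match with a single-layer match, following the general GRN idea.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8          # hypothetical Bi-LSTM hidden size
k = 4          # number of relevance channels (bilinear slices)
la, lb = 5, 6  # lengths of the two sentences

# Stand-ins for the Bi-LSTM hidden states of the two text segments.
Ha = rng.standard_normal((la, d))
Hb = rng.standard_normal((lb, d))

# GRN parameters: bilinear tensor M, single-layer weights V, gate weights Wg,
# and a vector u that combines the k channels into one relevance score.
M  = rng.standard_normal((k, d, d)) * 0.1
V  = rng.standard_normal((k, 2 * d)) * 0.1
Wg = rng.standard_normal((k, 2 * d)) * 0.1
u  = rng.standard_normal(k) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grn_score(h1, h2):
    """Gated blend of a bilinear match and a single-layer match for one word pair."""
    pair = np.concatenate([h1, h2])
    bilinear = np.einsum('kij,i,j->k', M, h1, h2)  # k bilinear interaction values
    single = np.tanh(V @ pair)                     # k single-layer interaction values
    g = sigmoid(Wg @ pair)                         # element-wise gate in (0, 1)
    return float(u @ (g * bilinear + (1.0 - g) * single))

# Word-by-word semantic interaction matrix between the two segments.
S = np.array([[grn_score(Ha[i], Hb[j]) for j in range(lb)] for i in range(la)])

# Pooling keeps only the most informative interactions; a simple global max here,
# where the paper's pooling layer would operate over sub-regions of S.
pooled = S.max()
```

The interaction matrix `S` is what the pooling layer consumes; the gate lets the model decide, per word pair, how much to trust the richer bilinear comparison versus the cheaper single-layer one.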



Author information


Correspondence to Yatian Shen, Jifan Chen, or Xuanjing Huang.


Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Shen, Y., Chen, J., Huang, X. (2016). Bidirectional Long Short-Term Memory with Gated Relevance Network for Paraphrase Identification. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds.) Natural Language Understanding and Intelligent Applications (ICCPOL/NLPCC 2016). Lecture Notes in Computer Science, vol. 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_4


  • DOI: https://doi.org/10.1007/978-3-319-50496-4_4


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50495-7

  • Online ISBN: 978-3-319-50496-4

  • eBook Packages: Computer Science (R0)
