Abstract
Semantic interaction between text segments, which has proven useful for detecting paraphrase relations, is often ignored in studies of paraphrase identification. In this paper, we adopt a neural network model for paraphrase identification, called the bidirectional Long Short-Term Memory with Gated Relevance Network (Bi-LSTM+GRN). In this model, a gated relevance network captures the semantic interactions between text segments, which are then aggregated by a pooling layer that selects the most informative interactions. Experiments on the Microsoft Research Paraphrase Corpus (MSRP) benchmark dataset show that this model achieves better performance than both hand-crafted feature based approaches and previous neural network models.
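The interaction step described above can be sketched in a few lines of NumPy. This is a minimal, illustrative reconstruction, not the paper's implementation: the parameter names (`M`, `W`, `Wg`, `u`) and initialization are assumptions. Each pair of hidden states (stand-ins for Bi-LSTM outputs) is scored by gating between a bilinear tensor term and a single-layer network term, yielding an interaction matrix over which pooling keeps the strongest entries.

```python
import numpy as np

def gated_relevance_scores(H1, H2, params):
    """Score every pair of hidden states from two sentences.

    H1: (m, d) and H2: (n, d) arrays of per-token hidden states.
    Returns an (m, n) interaction matrix. Hedged sketch; parameter
    names are illustrative, not taken from the paper.
    """
    M, W, Wg, u = params["M"], params["W"], params["Wg"], params["u"]
    m, n = len(H1), len(H2)
    S = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            h1, h2 = H1[i], H2[j]
            cat = np.concatenate([h1, h2])
            # bilinear tensor term: one score per tensor slice, (k,)
            bilinear = np.array([h1 @ M[s] @ h2 for s in range(M.shape[0])])
            # single-layer network term on the concatenated states, (k,)
            single = np.tanh(W @ cat)
            # sigmoid gate blends the two terms elementwise, (k,)
            gate = 1.0 / (1.0 + np.exp(-(Wg @ cat)))
            S[i, j] = u @ (gate * bilinear + (1.0 - gate) * single)
    return S

def init_params(d, k, rng):
    """Random small-scale parameters for a d-dim state, k tensor slices."""
    return {
        "M": rng.standard_normal((k, d, d)) * 0.1,
        "W": rng.standard_normal((k, 2 * d)) * 0.1,
        "Wg": rng.standard_normal((k, 2 * d)) * 0.1,
        "u": rng.standard_normal(k) * 0.1,
    }

rng = np.random.default_rng(0)
d, k = 8, 4
H1 = rng.standard_normal((5, d))   # stand-in for Bi-LSTM states, sentence 1
H2 = rng.standard_normal((7, d))   # stand-in for Bi-LSTM states, sentence 2
S = gated_relevance_scores(H1, H2, init_params(d, k, rng))
pooled = np.sort(S.ravel())[-3:]   # pooling: keep the strongest interactions
print(S.shape, pooled.shape)
```

In the full model the pooled interaction features would feed a classifier that outputs the paraphrase decision; here the top-3 pooling is only a placeholder for that aggregation step.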
© 2016 Springer International Publishing AG
Cite this paper
Shen, Y., Chen, J., Huang, X. (2016). Bidirectional Long Short-Term Memory with Gated Relevance Network for Paraphrase Identification. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds.) Natural Language Understanding and Intelligent Applications. ICCPOL/NLPCC 2016. Lecture Notes in Computer Science, vol. 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_4
Print ISBN: 978-3-319-50495-7
Online ISBN: 978-3-319-50496-4