LSTM Based Paraphrase Identification Using Combined Word Embedding Features

  • D. Aravinda Reddy
  • M. Anand Kumar
  • K. P. Soman
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 898)

Abstract

Paraphrase identification is the process of analyzing two text entities (sentences) and determining whether they convey the same meaning. It is a Natural Language Processing (NLP) task in which a pair of sentences must be classified as a paraphrase or not. The approach chosen here is a deep learning model, a Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM), using word embedding features. Word embedding is an approach that captures the semantics of a word in a dense vector representation. The word embedding models used for feature extraction in Telugu are Word2Vec, GloVe and fastText. The extracted features are supplied to the embedding layer of the LSTM in order to classify Telugu sentence pairs as paraphrase or non-paraphrase. The Telugu corpus was created manually from various Telugu newspapers, and the sentences used to train the word embedding models were also gathered from Telugu newspapers. This is the first attempt at paraphrase identification in Telugu using a deep learning approach.
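
The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. Several things here are assumptions rather than details taken from the paper: the abstract does not say how the three embeddings are combined, so plain concatenation is used; the toy 2-dimensional vectors, the placeholder tokens `w1` to `w3`, and all function names are hypothetical; and the shared LSTM encoder with a logistic output is only one plausible way to classify a sentence pair. The weights are random and untrained, so the output shows the data flow, not a meaningful prediction.

```python
import math
import random

# Toy lookup tables standing in for trained Word2Vec, GloVe and fastText
# models (hypothetical 2-dimensional vectors, not real data).
WORD2VEC = {"w1": [0.1, 0.2], "w2": [0.3, 0.4]}
GLOVE = {"w1": [0.5, 0.6], "w2": [0.7, 0.8]}
FASTTEXT = {"w1": [0.9, 1.0], "w2": [1.1, 1.2], "w3": [1.3, 1.4]}
DIM = 2

def combined_vector(word):
    """Concatenate the three embeddings; a missing word contributes zeros."""
    vec = []
    for table in (WORD2VEC, GLOVE, FASTTEXT):
        vec.extend(table.get(word, [0.0] * DIM))
    return vec

def build_embedding_matrix(vocab):
    """Row 0 is reserved for padding; row k holds the vector of vocab[k - 1]."""
    return [[0.0] * (3 * DIM)] + [combined_vector(w) for w in vocab]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def init_lstm(input_dim, hidden_dim, rng):
    """One weight matrix per gate, applied to [x; h]. Biases are omitted."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {gate: mat() for gate in ("input", "forget", "output", "cand")}

def lstm_encode(params, vectors, hidden_dim):
    """Run the LSTM over a sequence and return the final hidden state."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in vectors:
        z = x + h  # list concatenation: [x; h]
        i = [sigmoid(v) for v in matvec(params["input"], z)]
        f = [sigmoid(v) for v in matvec(params["forget"], z)]
        o = [sigmoid(v) for v in matvec(params["output"], z)]
        g = [math.tanh(v) for v in matvec(params["cand"], z)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h

def paraphrase_probability(sent_a, sent_b, matrix, index, params, out_w, hidden_dim):
    """Encode both sentences with the same LSTM and apply a logistic output."""
    enc = []
    for sent in (sent_a, sent_b):
        rows = [matrix[index.get(w, 0)] for w in sent]  # unknown word -> padding row
        enc.extend(lstm_encode(params, rows, hidden_dim))
    return sigmoid(sum(w * e for w, e in zip(out_w, enc)))

rng = random.Random(0)
vocab = ["w1", "w2", "w3"]
index = {w: k + 1 for k, w in enumerate(vocab)}
matrix = build_embedding_matrix(vocab)
HIDDEN = 4
params = init_lstm(3 * DIM, HIDDEN, rng)
out_w = [rng.uniform(-0.1, 0.1) for _ in range(2 * HIDDEN)]
prob = paraphrase_probability(["w1", "w2"], ["w2", "w3"], matrix, index,
                              params, out_w, HIDDEN)
print(len(matrix), len(matrix[0]), prob)
```

In a real system the embedding matrix would typically initialise the embedding layer of a deep learning framework, and the LSTM and output weights would be learned from labelled sentence pairs.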

Keywords

Paraphrase identification, Deep learning, RNN-LSTM, Word embedding model (Word2Vec, GloVe, fastText), Corpus

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • D. Aravinda Reddy (1), email author
  • M. Anand Kumar (1)
  • K. P. Soman (1)
  1. Amrita School of Engineering, Center for Computational Engineering and Networking (CEN), Coimbatore, Amrita Vishwa Vidyapeetham, Coimbatore, India
