A deep extraction model for an unseen keyphrase detection

  • Amin Ghazi Zahedi
  • Morteza Zahedi
  • Mansoor Fateh


Keyphrases represent the basic concepts of a text, and many natural language processing tasks require extracting high-quality keyphrases. Previous studies on text modeling have paid little attention to the meanings and concepts associated with the text. According to recent research, documents in the same cluster have much in common, especially in keyphrases that do not appear directly in a given document. The proposed model is therefore built around keyphrases that are absent from the document, which we call unseen keyphrases. The model extracts the basic concepts of a text using the same text estimates and by adding keyphrases to the hidden layers of the deep network during training. This structure first makes unseen keyphrases visible and then uses a recurrent neural network (RNN) to predict them. The proposed method largely addresses two problems: the failure to represent basic concepts and the handling of unseen keyphrases. This study provides new insight into the concept of text. The mechanism highlights the role of unseen keyphrases, which it surfaces directly without the need for external knowledge. The method is tested on four public datasets in this field. The results show an average improvement of 12% over established methods such as TF-IDF, KEA, and RNN.
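To make the notion of an unseen keyphrase concrete, the following minimal sketch splits a document's reference keyphrases into those that occur in the text and those that do not. It is an illustration only, not the authors' model: the function names, the sample document, and the sample keyphrases are hypothetical, and exact lowercase token matching without stemming is an assumption, since the abstract does not state how presence in the text is decided.

```python
import re


def normalize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(doc_tokens, phrase_tokens):
    """Return True if phrase_tokens occurs as a contiguous run in doc_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == phrase_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


def split_keyphrases(document, keyphrases):
    """Partition keyphrases into (seen, unseen) relative to the document.

    Assumption: a keyphrase counts as "seen" only on an exact token match.
    """
    doc_tokens = normalize(document)
    seen, unseen = [], []
    for phrase in keyphrases:
        if contains_phrase(doc_tokens, normalize(phrase)):
            seen.append(phrase)
        else:
            unseen.append(phrase)
    return seen, unseen


# Hypothetical example data, not taken from the paper's datasets.
doc = "Recurrent networks can label each token of a tweet as part of a keyphrase."
gold = ["recurrent networks", "keyphrase extraction", "sequence labeling"]
seen, unseen = split_keyphrases(doc, gold)
```

Under this matching rule, a purely extractive method can return at most the phrases in `seen`; the phrases in `unseen` are the ones the abstract describes as needing to be made visible before they can be predicted.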


Keywords: Keyphrase extraction, Sequence modeling, Clustering, Deep neural network, RNN


Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Faculty of Computer Engineering and Information Technology, Shahrood University of Technology, Shahrood, Iran
