PTR: Phrase-Based Topical Ranking for Automatic Keyphrase Extraction in Scientific Publications

  • Minmei Wang
  • Bo Zhao
  • Yihua Huang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9950)


Automatic keyphrase extraction plays an important role in many information retrieval (IR) and natural language processing (NLP) tasks. Motivated by the observations that phrases carry more semantic information than single words and that a document covers multiple semantic topics, we present PTR, a phrase-based topical ranking method for keyphrase extraction in scientific publications. Candidate keyphrases are assigned to topics by LDA, and the candidates of each topic serve as the vertices of a phrase-based graph for that topic. We then decompose PageRank into multiple weighted PageRank runs, one per topic, to rank the phrases within each topic. Finally, keyphrases are selected according to each candidate's overall score across all of its related topics. Experimental results show that PTR performs well on several datasets.
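The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the edge weights, the bias terms, nor the score-combination rule, so everything below is an assumption. In particular, the LDA step is replaced by hand-written topic assignments (`phrase_topic`, `doc_topic`), edges are treated as undirected co-occurrence weights, and a candidate's overall score is assumed to be the sum of its per-topic scores weighted by the document's topic proportions. The function names `weighted_pagerank` and `rank_phrases` are hypothetical.

```python
from collections import defaultdict


def weighted_pagerank(edges, bias, damping=0.85, iters=50):
    """Biased PageRank on an undirected, weighted graph.

    edges: {(u, v): weight} with u != v (assumed co-occurrence weights).
    bias:  {node: non-negative preference}; also defines the vertex set.
    """
    nodes = list(bias)
    total_bias = sum(bias.values())
    pref = {n: bias[n] / total_bias for n in nodes}

    # Symmetric adjacency map: each edge is stored in both directions.
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w
    out_weight = {n: sum(adj[n].values()) for n in nodes}

    score = dict(pref)
    for _ in range(iters):
        new_score = {}
        for n in nodes:
            # Each neighbour passes on its score in proportion to edge weight.
            incoming = sum(score[m] * w / out_weight[m] for m, w in adj[n].items())
            new_score[n] = (1 - damping) * pref[n] + damping * incoming
        score = new_score
    return score


def rank_phrases(edges, phrase_topic, doc_topic, top_k=3):
    """Rank candidates by their combined score over all related topics.

    phrase_topic: {topic: {phrase: preference of phrase under topic}}
    doc_topic:    {topic: proportion of the topic in the document}
    Both would come from LDA in practice; here they are given as inputs.
    """
    overall = defaultdict(float)
    for topic, prefs in phrase_topic.items():
        # One graph per topic: keep only edges between this topic's candidates.
        sub_edges = {
            (u, v): w for (u, v), w in edges.items() if u in prefs and v in prefs
        }
        for phrase, s in weighted_pagerank(sub_edges, prefs).items():
            overall[phrase] += doc_topic[topic] * s
    return sorted(overall, key=overall.get, reverse=True)[:top_k]


# Toy input (entirely made up for illustration).
edges = {
    ("topic model", "latent dirichlet allocation"): 2.0,
    ("topic model", "gibbs sampling"): 1.0,
    ("keyphrase extraction", "graph ranking"): 1.0,
}
phrase_topic = {
    0: {"topic model": 0.6, "latent dirichlet allocation": 0.3, "gibbs sampling": 0.1},
    1: {"keyphrase extraction": 0.7, "graph ranking": 0.3},
}
doc_topic = {0: 0.3, 1: 0.7}

print(rank_phrases(edges, phrase_topic, doc_topic))
```

Running one biased PageRank per topic, instead of a single run over the whole document, lets a phrase score highly because it is central to one topic even if it is peripheral to the document as a whole; the final weighting by `doc_topic` then favours phrases from the document's dominant topics.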


Keywords: Automatic keyphrase extraction, LDA, PageRank



This work was supported by China NSF Grants (No. 61572250 and No. 61223003) and Jiangsu Province Industry Support Program (BE2014131).



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. The National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing, China
  2. Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, China
