Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction

  • Conference paper
Digital Libraries and Multimedia Archives (IRCDL 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 806)

Abstract

To achieve state-of-the-art performance, keyphrase extraction systems rely on domain-specific knowledge and sophisticated features. In this paper, we propose a neural network architecture based on a Bidirectional Long Short-Term Memory Recurrent Neural Network that is able to detect the main topics of the input documents without the need to define new hand-crafted features. A preliminary experimental evaluation on the well-known INSPEC dataset confirms the effectiveness of the proposed solution.
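
One common way to use a recurrent network for this task is to treat keyphrase extraction as sequence labeling: the network assigns a label to each token, and contiguous labeled tokens are then merged into keyphrases. The sketch below shows only that final decoding step, in plain Python. The label names (`O`, `B-KP`, `I-KP`) and the function `decode_keyphrases` are assumptions made for illustration; the abstract does not specify the tagging scheme or the decoding used in the paper.

```python
from typing import List

# Hypothetical label names; the abstract does not specify a tagging scheme.
OUTSIDE, BEGIN, INSIDE = "O", "B-KP", "I-KP"


def decode_keyphrases(tokens: List[str], labels: List[str]) -> List[str]:
    """Merge contiguous B-KP/I-KP token spans into keyphrase strings."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    phrases: List[str] = []
    current: List[str] = []  # tokens of the keyphrase currently being built

    for token, label in zip(tokens, labels):
        if label == BEGIN:
            # A new keyphrase starts; close any phrase that was open.
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif label == INSIDE and current:
            # Continue the open keyphrase.
            current.append(token)
        else:
            # Outside label, or an I-KP with no preceding B-KP (ignored).
            if current:
                phrases.append(" ".join(current))
            current = []

    # Flush a keyphrase that runs to the end of the document.
    if current:
        phrases.append(" ".join(current))
    return phrases
```

In a full system, `labels` would come from the per-token predictions of the trained network; here they would be supplied by hand to exercise the decoding logic.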

M. Basaldella and E. Antolli contributed equally.

Author information

Corresponding author

Correspondence to Marco Basaldella.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Basaldella, M., Antolli, E., Serra, G., Tasso, C. (2018). Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction. In: Serra, G., Tasso, C. (eds) Digital Libraries and Multimedia Archives. IRCDL 2018. Communications in Computer and Information Science, vol 806. Springer, Cham. https://doi.org/10.1007/978-3-319-73165-0_18

  • DOI: https://doi.org/10.1007/978-3-319-73165-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73164-3

  • Online ISBN: 978-3-319-73165-0

  • eBook Packages: Computer Science, Computer Science (R0)
