Abstract
To achieve state-of-the-art performance, keyphrase extraction systems rely on domain-specific knowledge and sophisticated features. In this paper, we propose a neural network architecture based on a Bidirectional Long Short-Term Memory (BiLSTM) Recurrent Neural Network that is able to detect the main topics of the input documents without requiring new hand-crafted features. A preliminary experimental evaluation on the well-known INSPEC dataset confirms the effectiveness of the proposed solution.
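To illustrate the kind of architecture the abstract describes, the sketch below implements a minimal bidirectional LSTM sequence labeller in pure Python: a forward and a backward LSTM read the token embeddings, their hidden states are concatenated per token, and a softmax layer assigns each token a tag probability (e.g. beginning/inside/outside a keyphrase). This is an illustrative toy with random, untrained weights and an assumed three-tag scheme, not the authors' actual implementation; the embedding values, hidden size, and tag set are all hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

class LSTM:
    """A single-direction LSTM with input/forget/output gates and a cell state."""

    def __init__(self, n_in, n_hid, rng):
        def mat(r, c):
            return [[rng.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(r)]
        # One input matrix W and one recurrent matrix U per gate (i, f, o) and candidate (c).
        self.W = {g: mat(n_hid, n_in) for g in "ifoc"}
        self.U = {g: mat(n_hid, n_hid) for g in "ifoc"}
        self.n_hid = n_hid

    def step(self, x, h, c):
        i = [sigmoid(a + b) for a, b in zip(matvec(self.W["i"], x), matvec(self.U["i"], h))]
        f = [sigmoid(a + b) for a, b in zip(matvec(self.W["f"], x), matvec(self.U["f"], h))]
        o = [sigmoid(a + b) for a, b in zip(matvec(self.W["o"], x), matvec(self.U["o"], h))]
        g = [math.tanh(a + b) for a, b in zip(matvec(self.W["c"], x), matvec(self.U["c"], h))]
        c_new = [ff * cc + ii * gg for ff, cc, ii, gg in zip(f, c, i, g)]
        h_new = [oo * math.tanh(cc) for oo, cc in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        h = [0.0] * self.n_hid
        c = [0.0] * self.n_hid
        out = []
        for x in xs:
            h, c = self.step(x, h, c)
            out.append(h)
        return out

def bilstm_tag(embeddings, n_hid=4, n_tags=3, seed=0):
    """Run forward and backward LSTMs over the token embeddings, concatenate
    their hidden states per token, and apply a per-token softmax over tags."""
    rng = random.Random(seed)
    n_in = len(embeddings[0])
    fwd, bwd = LSTM(n_in, n_hid, rng), LSTM(n_in, n_hid, rng)
    hf = fwd.run(embeddings)
    hb = list(reversed(bwd.run(list(reversed(embeddings)))))
    Wout = [[rng.uniform(-0.1, 0.1) for _ in range(2 * n_hid)] for _ in range(n_tags)]
    probs = []
    for f, b in zip(hf, hb):
        scores = matvec(Wout, f + b)
        m = max(scores)
        exp = [math.exp(s - m) for s in scores]
        z = sum(exp)
        probs.append([e / z for e in exp])
    return probs

# Toy usage: 5 tokens with 3-dimensional "embeddings"; a real system would use
# pre-trained word vectors such as GloVe or word2vec.
tokens = [[0.1, 0.2, 0.0], [0.5, -0.1, 0.3], [0.0, 0.0, 1.0],
          [-0.2, 0.4, 0.1], [0.3, 0.3, -0.3]]
tag_probs = bilstm_tag(tokens)
```

In a trained system the weights would be learned by backpropagation over annotated documents; here they are random, so the output only demonstrates the shapes involved: one probability distribution over tags per input token.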
M. Basaldella and E. Antolli contributed equally.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Basaldella, M., Antolli, E., Serra, G., Tasso, C. (2018). Bidirectional LSTM Recurrent Neural Network for Keyphrase Extraction. In: Serra, G., Tasso, C. (eds) Digital Libraries and Multimedia Archives. IRCDL 2018. Communications in Computer and Information Science, vol 806. Springer, Cham. https://doi.org/10.1007/978-3-319-73165-0_18
DOI: https://doi.org/10.1007/978-3-319-73165-0_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73164-3
Online ISBN: 978-3-319-73165-0
eBook Packages: Computer Science, Computer Science (R0)