Abstract
The extraction of implicit citations is becoming more important because it is a fundamental step in many other applications, such as paper summarization, citation sentiment analysis, and citation classification. This paper describes the limitations of previous work on citation extraction and then proposes a new approach based on topic modeling and word embedding. As a first step, our approach uses the LDA technique to identify the topics discussed in the cited paper. Following the same idea as the Doc2Vec technique, our approach proposes two models. The first, called Sentence2Vec, is used to represent all sentences following an explicit citation. These sentences are candidates to be implicit citation sentences. The second model, called Topic2Vec, is used to represent the topics covered in the cited paper. Based on the similarity between the Sentence2Vec and Topic2Vec representations, we can label a candidate sentence as implicit or not.
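The final labeling step can be illustrated with a minimal sketch. It is not the authors' implementation: plain word-count vectors stand in for the learned Sentence2Vec and Topic2Vec representations, hand-written word lists stand in for the LDA topics, and the choice of cosine similarity, the function names, and the threshold of 0.3 are all assumptions made for illustration.

```python
import math
from collections import Counter


def bow_vector(words):
    """Toy stand-in for a learned embedding: a sparse word-count vector."""
    return Counter(w.lower() for w in words)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_candidates(candidates, topics, threshold=0.3):
    """Label each candidate sentence as implicit (True) or not (False).

    A sentence is labeled implicit when its best similarity to any
    topic vector reaches the threshold (a hypothetical value).
    """
    topic_vecs = [bow_vector(t) for t in topics]
    labels = []
    for sentence in candidates:
        sent_vec = bow_vector(sentence.split())
        best = max((cosine(sent_vec, t) for t in topic_vecs), default=0.0)
        labels.append(best >= threshold)
    return labels


# Hypothetical topics of the cited paper (the abstract obtains these with LDA).
topics = [
    ["topic", "model", "document", "word", "distribution"],
    ["citation", "sentence", "extraction", "context"],
]

# Hypothetical sentences following an explicit citation: the candidates.
candidates = [
    "Each document is a distribution over topic and word",
    "We thank the reviewers for their comments",
]

labels = label_candidates(candidates, topics)
print(labels)
```

In this toy setup the first candidate shares several words with the first topic and is labeled implicit, while the second shares none and is not. With learned embeddings, the same thresholding would operate on dense vectors rather than word counts.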
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Jebari, C., Cobo, M.J., Herrera-Viedma, E. (2018). A New Approach for Implicit Citation Extraction. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. IDEAL 2018. Lecture Notes in Computer Science, vol 11315. Springer, Cham. https://doi.org/10.1007/978-3-030-03496-2_14
DOI: https://doi.org/10.1007/978-3-030-03496-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03495-5
Online ISBN: 978-3-030-03496-2
eBook Packages: Computer Science, Computer Science (R0)