Abstract
Finding the hidden topics in a corpus, together with the latest influential papers on each topic, helps researchers get a quick overview of a scientific field and its recent development. Existing work has focused on finding milestone papers, which were usually published many years ago. Finding the latest influential papers is more challenging because newly published papers have little citation information. In this paper, we study this problem and propose a novel way of modeling citation links with a probabilistic generative model. The key idea is to consider two views of citation for each paper: citing and being cited. With this idea, we can not only model the topic dependence between cited and citing papers but also incorporate the latest papers into our model. We jointly model the two views with an extension of the topic model, the Bi-Citation-LDA model, which finds not only earlier important papers but also newly published influential papers in each topic. Experiments on a real dataset and comparisons with existing methods indicate that our model can effectively find the latest influential papers on each topic.
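As a rough illustration of the two views of citation described in the abstract, the sketch below builds a "citing" view and a "being cited" view from a list of citation links. It is not the Bi-Citation-LDA model and does no topic inference; the function name `citation_views` and the toy paper IDs are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def citation_views(links):
    """Build both views of a citation graph from (citing, cited) pairs.

    Hypothetical helper for illustration; not taken from the paper.
    """
    cites = defaultdict(set)     # citing view: paper -> papers it cites
    cited_by = defaultdict(set)  # cited view: paper -> papers that cite it
    for citing, cited in links:
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    return cites, cited_by


# Toy corpus (assumed data): "new" was just published, so nothing cites it yet.
links = [("b", "a"), ("c", "a"), ("new", "a"), ("new", "b")]
cites, cited_by = citation_views(links)

print(sorted(cites["new"]))   # references made by the new paper
print(len(cited_by["new"]))   # citations received by the new paper
```

In this toy corpus the newly published paper has no incoming citations, so the cited view alone says nothing about it, while its outgoing references in the citing view still link it to the rest of the corpus.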
Acknowledgement
This work was supported in part by the National Natural Science Foundation of China under grants No. 71272029, 71432004, 71490724, and 61472426, the 863 Program under grant No. 2014AA015204, and the Beijing Municipal Natural Science Foundation under grant No. 4152026.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Huang, L., Liu, H., He, J., Du, X. (2016). Finding Latest Influential Research Papers Through Modeling Two Views of Citation Links. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9931. Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_45
DOI: https://doi.org/10.1007/978-3-319-45814-4_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45813-7
Online ISBN: 978-3-319-45814-4
eBook Packages: Computer Science, Computer Science (R0)