Abstract
As academic research expands, citation recommendation is receiving increasing attention. To address the citation recommendation problem, researchers have begun to focus on network representation, because it fuses semantic and structural information well. A major challenge is how to map the articles in a heterogeneous information network into a low-dimensional space while preserving the potential associations between them. We propose a novel citation recommendation algorithm based on citation tendency, named CIRec, which learns more about the potential relationships between articles during network embedding. Citation tendency refers to the idea that an article selected as a reference probably satisfies certain conditions. In our algorithm, we first define five weight matrices, which represent the probability of entity-to-entity migration based on citation tendency, and use them to build a weighted heterogeneous network. Second, we design a biased random walk procedure that efficiently explores the articles' characteristics and citation information. Finally, the skip-gram model is used to learn the neighborhood relationships of the nodes in the walk sequences and to map the nodes into the vector space. Experimental results show that, compared with existing state-of-the-art techniques, CIRec achieves better recall, precision, and NDCG on the AAN and DBLP datasets.
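The last two steps of the pipeline described above (a weighted random walk followed by skip-gram training pairs) can be illustrated with a minimal sketch. This is not the CIRec implementation: the toy network, its node names, the edge weights, and the helper names `biased_walk` and `skipgram_pairs` are all hypothetical, and the weights merely stand in for the citation-tendency transition probabilities that the abstract mentions without defining.

```python
import random

# Hypothetical toy heterogeneous network: node -> {neighbor: transition weight}.
# The weights are made up for illustration; they are NOT the paper's matrices.
GRAPH = {
    "paper:p1": {"author:a1": 0.5, "paper:p2": 0.3, "venue:v1": 0.2},
    "paper:p2": {"author:a1": 0.4, "paper:p1": 0.6},
    "author:a1": {"paper:p1": 0.5, "paper:p2": 0.5},
    "venue:v1": {"paper:p1": 1.0},
}


def biased_walk(graph, start, length, rng):
    """Walk up to `length` nodes, choosing each next node in proportion to its edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1], {})
        if not neighbors:
            break  # dead end: stop the walk early
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def skipgram_pairs(walk, window):
    """Return the (center, context) pairs a skip-gram model would be trained on."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
walk = biased_walk(GRAPH, "paper:p1", 6, rng)
pairs = skipgram_pairs(walk, 2)
print(walk)
print(pairs[:5])
```

In a full system, the pairs (or the walks themselves) would be passed to a skip-gram trainer to obtain node vectors, and candidate articles would then be ranked by their similarity to the query article in that vector space.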
Keywords
Citation recommendation; Citation tendency; Heterogeneous information network; Network representation
Mathematics Subject Classification
68T99
JEL Classification
C63 C89
Notes
Acknowledgements
This work was partially supported by the National Natural Science Foundation of China (Grants #61876001, #61602003 and #61673020), the National Key Research and Development Program of China (Grant #2017YFB1401903), the Provincial Natural Science Foundation of Anhui Province (Grant #1708085QF156), and the Recruitment Project of Anhui University for Academic and Technology Leader.