Abstract
Computing PageRank for enormous and frequently evolving real-world networks consumes sizable resources and incurs large computational overhead. To address this problem, this paper proposes IMCPR, an incremental PageRank algorithm based on the Monte Carlo method. IMCPR computes PageRank scores by incrementally updating previous results according to the changed part of the network, instead of recomputing from scratch. IMCPR effectively improves performance and introduces no additional storage overhead. Theoretical analysis shows that the time complexity of IMCPR for updating PageRank scores in a network with m changed nodes and n changed edges is O((m+n/c)/c), where c is the reset probability. Updating PageRank scores takes O(1) work when a node or edge is inserted or removed. The time complexity of IMCPR is better than that of existing state-of-the-art algorithms for most real-world graphs. We evaluate IMCPR on real-world networks from different domains using Hama, a distributed platform. Experiments demonstrate that IMCPR obtains PageRank scores with accuracy equal to (or even higher than) that of the baseline Monte Carlo based PageRank algorithm and significantly reduces the amount of computation compared to other existing incremental algorithms.
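The general idea of Monte Carlo PageRank with incremental updates can be illustrated with a small sketch. The code below is not the IMCPR algorithm itself, whose details are not given in the abstract; it is a generic illustration under stated assumptions: a fixed number of random walks is stored per node, each walk stops with the reset probability c, scores are estimated as the fraction of visits each node receives, and an edge insertion is handled by repairing only the stored walks that visit the source node. All function names are hypothetical, and the linear scan over stored walks is for brevity only; a practical implementation would presumably index walks by the nodes they visit.

```python
import random
from collections import defaultdict


def simulate_walk(graph, start, c, rng):
    """Walk from `start`; stop with probability c per step, or at a dangling node."""
    path = [start]
    node = start
    while rng.random() >= c:
        neighbors = graph.get(node)
        if not neighbors:
            break
        node = rng.choice(neighbors)
        path.append(node)
    return path


def monte_carlo_walks(graph, nodes, c, walks_per_node, rng):
    """Store `walks_per_node` independent walks starting from every node."""
    return [simulate_walk(graph, v, c, rng)
            for v in nodes for _ in range(walks_per_node)]


def scores(walks):
    """Estimate PageRank as each node's share of all visits (an assumed estimator)."""
    visits = defaultdict(int)
    total = 0
    for walk in walks:
        for v in walk:
            visits[v] += 1
            total += 1
    return {v: n / total for v, n in visits.items()}


def add_edge(graph, walks, u, v, c, rng):
    """Insert edge u -> v and repair stored walks in place.

    Returns the number of walks that were rerouted. Walks that never
    visit u are left untouched.
    """
    old_degree = len(graph.get(u, []))
    graph.setdefault(u, []).append(v)
    graph.setdefault(v, [])
    new_degree = old_degree + 1
    rerouted = 0
    for i, walk in enumerate(walks):
        for pos, node in enumerate(walk):
            if node != u:
                continue
            if pos == len(walk) - 1:
                # The walk ended at u. If u used to be dangling, the walk
                # would now continue with probability 1 - c.
                reroute = old_degree == 0 and rng.random() >= c
            else:
                # The walk left u through an old edge; switch to the new
                # edge with probability 1 / new_degree.
                reroute = rng.random() < 1.0 / new_degree
            if reroute:
                # Keep the prefix up to u and resample the rest from v.
                walks[i] = walk[:pos + 1] + simulate_walk(graph, v, c, rng)
                rerouted += 1
                break
    return rerouted
```

For example, with `graph = {0: [1], 1: [2], 2: [0], 3: []}` one could call `monte_carlo_walks(graph, list(graph), 0.15, 50, random.Random(0))`, then `add_edge(...)` to insert the edge 3 -> 0, and finally `scores(walks)` to read the updated estimates without regenerating the walks that never visit node 3.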
Acknowledgement
We would like to thank Shan Shan for helpful suggestions.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Liao, Q., Jiang, S., Yu, M., Yang, Y., Li, T. (2017). Monte Carlo Based Incremental PageRank on Evolving Graphs. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10234. Springer, Cham. https://doi.org/10.1007/978-3-319-57454-7_28
DOI: https://doi.org/10.1007/978-3-319-57454-7_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57453-0
Online ISBN: 978-3-319-57454-7
eBook Packages: Computer Science (R0)