Strongly connected components based efficient computation of page rank

  • Research Article
  • Frontiers of Computer Science

Abstract

This paper proposes an efficient exact algorithm for computing page rank (PR) that improves computation efficiency without sacrificing the accuracy of the results. Existing exact algorithms are generally based on the original power method (PM). To reduce the number of I/Os and thereby improve efficiency, they partition a large graph into multiple smaller subgraphs, each of which fits entirely in memory. The algorithm proposed in this paper further reduces the number of I/Os required. Instead of partitioning the graph into arbitrary subgraphs, it partitions the graph into a special kind of subgraph: strongly connected components (SCCs), in which every node is reachable from every other node. By exploiting the properties of SCCs, we establish several theoretical results that allow the iterative computation to be confined to individual SCC subgraphs. As a result, the algorithm avoids a large number of I/Os and a large amount of computation while preserving the accuracy of the results. In short, it is more efficient than the existing exact algorithms. Experiments demonstrate that the proposed algorithm clearly improves efficiency and attains highly accurate results.
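The general idea of confining iterations to SCCs can be illustrated with a minimal Python sketch. It is not the algorithm of this paper, whose details are not given in the abstract. It assumes the simple formulation PR(v) = (1 - d)/N + d · Σ PR(u)/outdeg(u) over in-neighbours u, in which dangling nodes lose their rank mass rather than redistributing it; under that assumption the linear system is block triangular when SCCs are ordered topologically, so each SCC can be solved once its upstream SCCs are final. The function names, the damping factor `d`, and the tolerance `tol` are illustrative choices, and the SCCs are found with Kosaraju's two-pass method.

```python
def scc_topological(n, succ, pred):
    """Kosaraju's algorithm; returns SCCs with upstream components first."""
    seen, order = [False] * n, []
    for s in range(n):                      # pass 1: DFS finish order on G
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(succ[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, iter(succ[w])))
                    break
            else:                           # all successors explored
                order.append(v)
                stack.pop()
    comp, comps = [-1] * n, []
    for s in reversed(order):               # pass 2: DFS on reversed graph
        if comp[s] != -1:
            continue
        cid, members, stack = len(comps), [s], [s]
        comp[s] = cid
        while stack:
            v = stack.pop()
            for u in pred[v]:
                if comp[u] == -1:
                    comp[u] = cid
                    members.append(u)
                    stack.append(u)
        comps.append(members)
    return comps


def build_adjacency(n, edges):
    succ = [[] for _ in range(n)]
    pred = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
    return succ, pred


def pagerank_by_scc(n, edges, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate inside one SCC at a time, in topological order."""
    succ, pred = build_adjacency(n, edges)
    outdeg = [len(s) for s in succ]
    rank = [0.0] * n
    for members in scc_topological(n, succ, pred):
        inside = set(members)
        # Teleport term plus contributions from upstream SCCs, which are
        # already final and therefore constant during this SCC's iterations.
        base = {v: (1 - d) / n + d * sum(rank[u] / outdeg[u]
                                         for u in pred[v] if u not in inside)
                for v in members}
        for v in members:
            rank[v] = base[v]
        for _ in range(max_iter):
            new = {v: base[v] + d * sum(rank[u] / outdeg[u]
                                        for u in pred[v] if u in inside)
                   for v in members}
            delta = max(abs(new[v] - rank[v]) for v in members)
            for v in members:
                rank[v] = new[v]
            if delta < tol:
                break
    return rank


def pagerank_power(n, edges, d=0.85, tol=1e-12, max_iter=10_000):
    """Reference: plain power-style iteration over the whole graph."""
    succ, pred = build_adjacency(n, edges)
    outdeg = [len(s) for s in succ]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        new = [(1 - d) / n + d * sum(rank[u] / outdeg[u] for u in pred[v])
               for v in range(n)]
        delta = max(abs(a - b) for a, b in zip(new, rank))
        rank = new
        if delta < tol:
            break
    return rank
```

Calling `pagerank_by_scc(6, edges)` and `pagerank_power(6, edges)` on the same edge list should agree up to the tolerance. In the per-SCC version, a component that consists of a single node without a self-loop needs no real iteration, which hints at where savings in computation can come from.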

Author information

Corresponding author

Correspondence to Hongguo Yang.

Additional information

Hongguo Yang is a PhD candidate in the School of Computer Science and Engineering, Northeastern University, China. Her research interests include top-k, entity search, entity ranking and recommendation.

Derong Shen is a full professor and a PhD supervisor in the School of Computer Science and Engineering, Northeastern University, China, where she received her PhD degree in computer science in 2004. She received her BS and MS degrees in computer science from Jilin University, China in 1987 and 1990, respectively. She is a senior member of CCF, and a member of ACM and IEEE. Her interests include distributed data management and data integration.

Yue Kou is an associate professor in the School of Computer Science and Engineering, Northeastern University, China, where she received her BS degree in computer science and her MS and PhD degrees in computer software and theory in 2002, 2005, and 2009, respectively. Her interests include entity search and data mining.

Tiezheng Nie is an associate professor in the School of Computer Science and Engineering, Northeastern University, China, where he received his BS degree in computer science and his MS and PhD degrees in computer software and theory in 2002, 2005, and 2009, respectively. His interests include data quality and data integration.

Ge Yu is a full professor and a PhD supervisor in the College of Information Science and Engineering, Northeastern University, China, where he received his BS and MS degrees in computer science in 1982 and 1985, respectively. He received his PhD degree in computer science from Kyushu University, Japan in 1996. He is a senior member of the CCF, and a member of ACM and IEEE. His interests include databases and big data management.

About this article

Cite this article

Yang, H., Shen, D., Kou, Y. et al. Strongly connected components based efficient computation of page rank. Front. Comput. Sci. 12, 1208–1219 (2018). https://doi.org/10.1007/s11704-016-6168-0

  • DOI: https://doi.org/10.1007/s11704-016-6168-0
