Hypergraph Partitioning for Faster Parallel PageRank Computation

  • Conference paper
Formal Techniques for Computer Systems and Business Processes (EPEW 2005, WS-FM 2005)

Abstract

The PageRank algorithm is used by search engines such as Google to order web pages. It uses an iterative numerical method to compute the dominant eigenvector (the eigenvector associated with the largest eigenvalue) of a transition matrix derived from the web’s hyperlink structure and a user-centred model of web-surfing behaviour. As the web has expanded and as demand for user-tailored web page ordering metrics has grown, scalable parallel computation of PageRank has become a focus of considerable research effort.
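To make the iterative method concrete, here is a minimal serial sketch of a power iteration for PageRank. It is an illustration under assumptions, not the paper's implementation: the function name `pagerank`, the damping factor of 0.85, the uniform handling of pages without out-links, and the L1 stopping test are all choices made for this example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on a small link graph (illustrative sketch).

    links: dict mapping a page to the list of pages it links to.
    damping, tol and max_iter are assumed values, not taken from the paper.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by pages with no out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}
        # Each page passes an equal share of its rank along every out-link.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)
        # Stop once the L1 change between successive iterates is small.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
example_rank = pagerank(example_links)
```

Each pass of the loop is in effect one sparse matrix-vector product, which is the operation a parallel implementation has to distribute across processors.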

In this paper, we seek a scalable problem decomposition for parallel PageRank computation, through the use of state-of-the-art hypergraph-based partitioning schemes. These have not been previously applied in this context. We consider both one- and two-dimensional hypergraph decomposition models. Exploiting the recent availability of the Parkway 2.1 parallel hypergraph partitioner, we present empirical results on a gigabit PC cluster for three publicly available web graphs. Our results show that hypergraph-based partitioning substantially reduces communication volume over conventional partitioning schemes (by up to three orders of magnitude), while still maintaining computational load balance. They also show a halving of the per-iteration runtime cost when compared to the most effective alternative approach used to date.
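As a rough illustration of why the choice of partition affects communication volume, the sketch below counts the vector entries that must be sent between processors for one sparse matrix-vector product under a one-dimensional row-wise decomposition. It assumes the column-net hypergraph model commonly used for this problem, in which each column is a net and the cost of a net is the number of parts it touches minus one. The function name `comm_volume`, the toy matrix, and both partitions are hypothetical and are not taken from the paper.

```python
def comm_volume(nonzeros, part):
    """Communication volume of y = A x under a row-wise partition (sketch).

    nonzeros: iterable of (row, col) positions of the nonzeros of A.
    part: list where part[i] is the processor owning row i and entry x[i].
    Assumes the column-net model: entry x[j] is sent once to every other
    processor that owns a row with a nonzero in column j.
    """
    parts_touched = {}
    for i, j in nonzeros:
        # The owner of x[j] is always among the parts touched by net j.
        parts_touched.setdefault(j, {part[j]}).add(part[i])
    return sum(len(parts) - 1 for parts in parts_touched.values())


# Hypothetical 4x4 sparsity pattern and two candidate 2-way partitions.
toy_nonzeros = [(0, 1), (1, 0), (2, 3), (3, 2), (0, 2)]
volume_clustered = comm_volume(toy_nonzeros, [0, 0, 1, 1])
volume_interleaved = comm_volume(toy_nonzeros, [0, 1, 0, 1])
```

Both partitions assign two rows to each processor, so they are equally balanced by row count, yet the clustered assignment needs one transfer per product and the interleaved one needs four. A hypergraph partitioner searches for assignments that minimise this quantity subject to a balance constraint.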

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bradley, J.T., de Jager, D.V., Knottenbelt, W.J., Trifunović, A. (2005). Hypergraph Partitioning for Faster Parallel PageRank Computation. In: Bravetti, M., Kloul, L., Zavattaro, G. (eds) Formal Techniques for Computer Systems and Business Processes. EPEW 2005, WS-FM 2005. Lecture Notes in Computer Science, vol 3670. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11549970_12

  • DOI: https://doi.org/10.1007/11549970_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28701-8

  • Online ISBN: 978-3-540-31903-0

  • eBook Packages: Computer Science, Computer Science (R0)
