Shortest Path Computation over Disk-Resident Large Graphs Based on Extended Bulk Synchronous Parallel Methods

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7826)

Abstract

Single Source Shortest Path (SSSP) computation over large graphs poses significant challenges to memory capacity and processing efficiency. Disk-based parallel iterative computing is an economical solution, but the costs of disk I/O and communication heavily affect performance. This paper proposes a state-transition model for SSSP and designs two optimization strategies based on it. First, we introduce a tunable hash index to reduce the amount of unneeded data loaded from disk. Second, we propose a new iterative mechanism and design an Across-step Message Pruning (ASMP) policy to address the communication bottleneck. Experimental results show that our SSSP computation is 2 times faster than a basic implementation on Giraph (a memory-resident parallel framework). Compared with Hadoop and Hama (disk-resident parallel frameworks), the speedup is 21 to 43 times.
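To make the setting concrete, the following is a minimal single-process sketch of superstep-based (BSP-style) SSSP with a simple message-pruning rule. It is not the paper's implementation: the abstract does not define the ASMP policy or the hash index, so the function name `bsp_sssp`, the `best_sent` cache, and the pruning rule itself are assumptions chosen only to illustrate how remembering information across supersteps can suppress messages that cannot improve a distance. Edge weights are assumed to be non-negative.

```python
import math

def bsp_sssp(graph, source):
    """Superstep-style SSSP over graph: dict vertex -> list of (neighbor, weight).

    Returns (dist, supersteps). Illustrative sketch only, not the paper's method.
    """
    dist = {}                    # best known distance per reached vertex
    best_sent = {source: 0.0}    # hypothetical cache: smallest value ever sent to each vertex
    inbox = {source: 0.0}        # messages delivered at the start of the superstep
    supersteps = 0
    while inbox:
        outbox = {}
        for v, candidate in inbox.items():
            # Receiver-side guard: ignore a message that does not improve v.
            if candidate >= dist.get(v, math.inf):
                continue
            dist[v] = candidate
            for u, w in graph.get(v, ()):
                new_d = candidate + w
                # Pruning across supersteps (assumed rule): a message is sent only
                # if it beats every value already sent to u in this or an earlier
                # superstep; otherwise it cannot lower u's distance.
                if new_d < best_sent.get(u, math.inf):
                    best_sent[u] = new_d
                    outbox[u] = new_d
        inbox = outbox           # synchronization barrier between supersteps
        supersteps += 1
    return dist, supersteps
```

Called on a small graph such as `{"a": [("b", 4), ("c", 1)], "c": [("b", 2), ("d", 7)], "b": [("d", 1)], "d": []}` with source `"a"`, the message from `c` to `d` (distance 8) is pruned because a smaller value (5) was already sent to `d`. In a distributed setting the cache would be kept per sender or per worker rather than globally, which this sketch does not model.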




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Z., Gu, Y., Zimmermann, R., Yu, G. (2013). Shortest Path Computation over Disk-Resident Large Graphs Based on Extended Bulk Synchronous Parallel Methods. In: Meng, W., Feng, L., Bressan, S., Winiwarter, W., Song, W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7826. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37450-0_1

  • DOI: https://doi.org/10.1007/978-3-642-37450-0_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37449-4

  • Online ISBN: 978-3-642-37450-0

  • eBook Packages: Computer Science, Computer Science (R0)
