A Balanced Vertex Cut Partition Method in Distributed Graph Computing

  • Conference paper
Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques (IScIDE 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9243)

Abstract

Graph computing plays an important role in mining data at large scale. Partitioning is the first step in processing a large graph on a distributed system. A good partition incurs lower communication and memory costs and balances load more evenly, so that the whole system is put to use. Traditional edge-cut methods introduce large communication costs on realistic power-law graphs. Current vertex-cut methods give little consideration to load balance and therefore perform poorly, especially for online streaming vertex-cut partitioning. In this paper, we formulate the total cost of graph computing (partition cost, communication cost, and computing cost), particularly for iterative algorithms, and analyze the cost of current partitioning methods. In addition, we explore a novel vertex-cut method that ensures a lower total cost: it achieves a more balanced load with less communication. Experiments show that our method outperforms the partitioning methods in state-of-the-art graph computing frameworks by an average of 10 percent.
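To make the terms in the abstract concrete, the following is a minimal Python sketch of a generic greedy streaming vertex-cut partitioner, together with the two quantities the abstract trades off: replication (a proxy for communication and memory cost) and load balance. It is not the method proposed in the paper. The function names, the `slack` parameter, and the capacity cap are assumptions introduced here for illustration only.

```python
import math
from collections import defaultdict


def greedy_vertex_cut(edges, num_parts, slack=1.1):
    """Assign each edge to one partition (a vertex cut) in stream order.

    Hypothetical illustration: a partition stops accepting edges once it
    holds ceil(slack * |E| / num_parts) of them. Requires slack >= 1.
    """
    edges = list(edges)
    cap = math.ceil(slack * len(edges) / num_parts)
    replicas = defaultdict(set)  # vertex -> partitions holding a replica of it
    load = [0] * num_parts       # number of edges stored on each partition
    assignment = []              # partition chosen for each edge, in order

    for u, v in edges:
        # Partitions that still have room under the capacity cap.
        open_parts = {p for p in range(num_parts) if load[p] < cap}
        # Prefer a partition that already holds both endpoints, then one
        # that holds either endpoint, then any open partition.
        common = replicas[u] & replicas[v] & open_parts
        either = (replicas[u] | replicas[v]) & open_parts
        candidates = common or either or open_parts
        # Break ties by lowest load, then lowest index, for determinism.
        p = min(candidates, key=lambda q: (load[q], q))
        assignment.append(p)
        load[p] += 1
        replicas[u].add(p)
        replicas[v].add(p)

    return assignment, replicas, load


def replication_factor(replicas):
    """Average number of partitions each vertex is replicated on."""
    return sum(len(parts) for parts in replicas.values()) / len(replicas)


def load_imbalance(load):
    """Maximum partition load divided by the mean load (1.0 is perfect)."""
    return max(load) / (sum(load) / len(load))
```

For example, calling `greedy_vertex_cut([(0, i) for i in range(1, 9)], num_parts=4, slack=1.0)` partitions a star graph whose hub has degree 8. The cap forces the hub to be replicated on several partitions, which shows how tightening the balance constraint raises the replication factor.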

Acknowledgment

This work was supported by the National High Technology Research and Development (863) Program of China under Grant No. 2013AA013205 and by the Program of the State Key Laboratory of High-end Server & Storage Technology under Grant No. 2014HSSA16.

Author information

Corresponding author

Correspondence to Rujun Sun.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sun, R., Zhang, L., Chen, Z., Hao, Z. (2015). A Balanced Vertex Cut Partition Method in Distributed Graph Computing. In: He, X., et al. Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_5

  • DOI: https://doi.org/10.1007/978-3-319-23862-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23861-6

  • Online ISBN: 978-3-319-23862-3

  • eBook Packages: Computer Science, Computer Science (R0)
