Skip to main content

Pregel-Like Systems

  • Chapter
  • First Online:
Systems for Big Graph Analytics

Part of the book series: SpringerBriefs in Computer Science ((BRIEFSCOMPUTER))

  • 895 Accesses

Abstract

We first introduce Google’s Pregel, a framework for distributed graph computation that pioneered the idea of vertex-centric programming. In the vertex-centric model, a user writes a distributed graph algorithm simply by specifying the behavior of a generic vertex. In recent years, many Pregel-like systems have been developed, and we will investigate how these systems improve on the basic framework of Pregel. The background knowledge to be introduced in this chapter lays the foundation of hands-on experiencing (in the next chapter) of our Pregel-like system implementation, BigGraph@CUHK, that is both highly efficient, and very suitable for educational and research purposes due to its neat design.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    Writing to a DFS generates network overhead since each data block are replicated to multiple machines to tolerate machine failures.

  2. 2.

    http://giraph.apache.org/.

  3. 3.

    http://hadoop.apache.org/.

  4. 4.

    https://turi.com/.

  5. 5.

    http://www.cse.cuhk.edu.hk/systems/graph/.

  6. 6.

    http://giraph.apache.org/ooc.html.

References

  1. S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. In Proceedings of the Seventh International World-Wide Web Conference (WWW), pages 107–117, 1998.

    Google Scholar 

  2. Y. Bu, V. R. Borkar, J. Jia, M. J. Carey, and T. Condie. Pregelix: Big(ger) graph analytics on a dataflow engine. PVLDB, 8(2):161–172, 2014.

    Google Scholar 

  3. Y. Bu, V. R. Borkar, G. H. Xu, and M. J. Carey. A bloat-aware design for big data applications. In ISMM, pages 119–130, 2013.

    Google Scholar 

  4. K. M. Chandy and L. Lamport. Distributed snapshots: Determining global states of distributed systems. ACM Trans. Comput. Syst., 3(1):63–75, 1985.

    Article  Google Scholar 

  5. A. Ching, S. Edunov, M. Kabiljo, D. Logothetis, and S. Muthukrishnan. One trillion edges: Graph processing at facebook-scale. PVLDB, 8(12):1804–1815, 2015.

    Google Scholar 

  6. J. Dean and S. Ghemawat. Mapreduce: Simplified data processing on large clusters. In OSDI, pages 137–150, 2004.

    Google Scholar 

  7. E. N. M. Elnozahy, L. Alvisi, Y.-M. Wang, and D. B. Johnson. A survey of rollback-recovery protocols in message-passing systems. ACM Comput. Surv., 34(3):375–408, Sept. 2002.

    Google Scholar 

  8. S. Ghemawat, H. Gobioff, and S. Leung. The Google file system. In SOSP, pages 29–43, 2003.

    Google Scholar 

  9. J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In OSDI, pages 17–30, 2012.

    Google Scholar 

  10. J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. Graphx: Graph processing in a distributed dataflow framework. In OSDI, pages 599–613, 2014.

    Google Scholar 

  11. U. Kang, C. E. Tsourakakis, and C. Faloutsos. PEGASUS: A peta-scale graph mining system. In ICDM, pages 229–238, 2009.

    Google Scholar 

  12. Z. Khayyat, K. Awara, A. Alonazi, H. Jamjoom, D. Williams, and P. Kalnis. Mizan: a system for dynamic load balancing in large-scale graph processing. In EuroSys, pages 169–182, 2013.

    Google Scholar 

  13. A. Kyrola, G. E. Blelloch, and C. Guestrin. GraphChi: Large-scale graph computation on just a PC. In OSDI, pages 31–46, 2012.

    Google Scholar 

  14. J. Lin and M. Schatz. Design patterns for efficient graph algorithms in mapreduce. In MLG, pages 78–85. ACM, 2010.

    Google Scholar 

  15. Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A framework for machine learning in the cloud. PVLDB, 5(8):716–727, 2012.

    Google Scholar 

  16. G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In SIGMOD Conference, pages 135–146, 2010.

    Google Scholar 

  17. L. Quick, P. Wilkinson, and D. Hardcastle. Using pregel-like large scale graph processing frameworks for social network analysis. In ASONAM, pages 457–463, 2012.

    Google Scholar 

  18. A. Roy, L. Bindschaedler, J. Malicevic, and W. Zwaenepoel. Chaos: scale-out graph processing from secondary storage. In SOSP, pages 410–424, 2015.

    Google Scholar 

  19. A. Roy, I. Mihailovic, and W. Zwaenepoel. X-stream: edge-centric graph processing using streaming partitions. In SOSP, pages 472–488, 2013.

    Google Scholar 

  20. S. Salihoglu and J. Widom. GPS: a graph processing system. In SSDBM, pages 22:1–22:12, 2013.

    Google Scholar 

  21. S. Salihoglu and J. Widom. Optimizing graph algorithms on pregel-like systems. PVLDB, 7(7):577–588, 2014.

    Google Scholar 

  22. S. Schelter, S. Ewen, K. Tzoumas, and V. Markl. “All roads lead to Rome”: optimistic recovery for distributed iterative data processing. In CIKM, pages 1919–1928, 2013.

    Google Scholar 

  23. Z. Shang and J. X. Yu. Catch the wind: Graph workload balancing on cloud. In ICDE, pages 553–564, 2013.

    Google Scholar 

  24. Y. Shao, B. Cui, and L. Ma. PAGE: A partition aware engine for parallel graph computation. IEEE Trans. Knowl. Data Eng., 27(2):518–530, 2015.

    Article  Google Scholar 

  25. Y. Shen, G. Chen, H. V. Jagadish, W. Lu, B. C. Ooi, and B. M. Tudor. Fast failure recovery in distributed graph processing systems. PVLDB, 8(4):437–448, 2014.

    Google Scholar 

  26. Y. Shiloach and U. Vishkin. An o(log n) parallel connectivity algorithm. J. Algorithms, 3(1):57–67, 1982.

    Article  MathSciNet  MATH  Google Scholar 

  27. P. Wang, K. Zhang, R. Chen, H. Chen, and H. Guan. Replication-based fault-tolerance for large-scale graph processing. In DSN, pages 562–573, 2014.

    Google Scholar 

  28. D. Yan, Y. Bu, Y. Tian, and A. Deshpande. Big graph analytics platforms. Foundations and Trends in Databases, 7(1–2):1–195, 2017.

    Article  Google Scholar 

  29. D. Yan, J. Cheng, Y. Lu, and W. Ng. Effective techniques for message reduction and load balancing in distributed graph computation. In WWW, pages 1307–1317, 2015.

    Google Scholar 

  30. D. Yan, J. Cheng, M. T. Özsu, F. Yang, Y. Lu, J. C. S. Lui, Q. Zhang, and W. Ng. A general-purpose query-centric framework for querying big graphs. PVLDB, 9(7):564–575, 2016.

    Google Scholar 

  31. D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu. Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB, 7(14):1821–1832, 2014.

    Google Scholar 

  32. D. Yan, J. Cheng, and F. Yang. Lightweight fault tolerance in large-scale distributed graph processing. CoRR, abs/1601.06496, 2016.

    Google Scholar 

  33. D. Yan, Y. Huang, J. Cheng, and H. Wu. Efficient processing of very large graphs in a small cluster. CoRR, abs/1601.05590, 2016.

    Google Scholar 

  34. M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauly, M. J. Franklin, S. Shenker, and I. Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In NSDI, pages 15–28, 2012.

    Google Scholar 

  35. C. Zhou, J. Gao, B. Sun, and J. X. Yu. Mocgraph: Scalable distributed graph processing using message online computing. PVLDB, 8(4):377–388, 2014.

    Google Scholar 

  36. X. Zhu, W. Han, and W. Chen. Gridgraph: Large-scale graph processing on a single machine using 2-level hierarchical partitioning. In USENIX ATC, pages 375–386, 2015.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Rights and permissions

Reprints and permissions

Copyright information

© 2017 The Author(s)

About this chapter

Cite this chapter

Yan, D., Tian, Y., Cheng, J. (2017). Pregel-Like Systems. In: Systems for Big Graph Analytics. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-58217-7_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-58217-7_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58216-0

  • Online ISBN: 978-3-319-58217-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics