Think Sequential, Run Parallel

  • Chapter
  • First Online:
Symposium on Real-Time and Hybrid Systems

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11180)

Abstract

Parallel computation is often a must when processing large-scale graphs. However, it is nontrivial to write parallel graph algorithms with correctness guarantees. This paper presents the programming model of GRAPE, a parallel GRAPh Engine [19]. GRAPE allows users to “plug in” sequential (single-machine) graph algorithms as a whole, and it parallelizes the algorithms across a cluster of processors. In other words, it simplifies parallel programming for graph computations, from think parallel to think sequential. Under a monotonic condition, it is guaranteed to converge to correct answers as long as the sequential algorithms are correct. We present the foundation underlying GRAPE, based on simultaneous fixpoint computation. As examples, we demonstrate how GRAPE parallelizes our familiar sequential graph algorithms. Furthermore, we show that in addition to its programming simplicity, GRAPE achieves performance comparable to that of state-of-the-art graph systems.
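
To make the programming model concrete, here is a minimal, self-contained sketch (in Python) of the kind of fixpoint computation the abstract refers to, applied to single-source shortest paths over a partitioned graph. The functions peval and inceval, the fragment representation, and the message-passing loop are illustrative assumptions loosely following [19], not GRAPE's actual API: each worker first runs a sequential algorithm on its own fragment, and the partial results are then refined round by round with incoming border updates until no fragment changes, i.e., a simultaneous fixpoint is reached.

```python
# A minimal sketch of the fixpoint scheme described in the abstract (illustrative
# assumptions, not GRAPE's actual API): sequential evaluation per fragment (peval),
# incremental refinement driven by border messages (inceval), and assembly of the
# per-fragment results once no fragment receives new messages.

import math
from collections import defaultdict

def peval(fragment, source):
    """Sequential single-source shortest paths (Bellman-Ford style) on one fragment."""
    dist = {v: math.inf for v in fragment["nodes"]}
    if source in dist:
        dist[source] = 0.0
    changed = True
    while changed:
        changed = False
        for (u, v, w) in fragment["edges"]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
    return dist

def inceval(fragment, dist, messages):
    """Refine dist after border nodes receive smaller distances from other fragments."""
    for v, d in messages.items():
        if d < dist.get(v, math.inf):
            dist[v] = d
    # Re-run relaxation; a real incremental step would touch only the affected area.
    changed = True
    while changed:
        changed = False
        for (u, v, w) in fragment["edges"]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
    return dist

def run_fixpoint(fragments, source):
    # Round 0: partial evaluation of the query on every fragment, in isolation.
    partial = [peval(f, source) for f in fragments]
    while True:
        # Exchange smaller distances of shared border nodes between fragments.
        inbox = [defaultdict(lambda: math.inf) for _ in fragments]
        for i, f in enumerate(fragments):
            for j, g in enumerate(fragments):
                if i == j:
                    continue
                for v in f["border"] & g["nodes"]:
                    if partial[i][v] < partial[j].get(v, math.inf):
                        inbox[j][v] = min(inbox[j][v], partial[i][v])
        if not any(inbox):
            break  # simultaneous fixpoint: no fragment receives a new message
        partial = [inceval(f, d, m) for f, d, m in zip(fragments, partial, inbox)]
    # Assemble: combine the per-fragment results into the global answer.
    answer = {}
    for d in partial:
        for v, x in d.items():
            answer[v] = min(answer.get(v, math.inf), x)
    return answer

if __name__ == "__main__":
    # Two fragments that share the border node "c".
    f1 = {"nodes": {"a", "b", "c"}, "border": {"c"},
          "edges": [("a", "b", 1.0), ("b", "c", 1.0)]}
    f2 = {"nodes": {"c", "d", "e"}, "border": {"c"},
          "edges": [("c", "d", 2.0), ("d", "e", 1.0)]}
    print(run_fixpoint([f1, f2], source="a"))  # shortest distances: a=0, b=1, c=2, d=4, e=5
```

Running the sketch on the two toy fragments yields the correct distances from "a"; correctness rests only on the plugged-in sequential steps and the monotonically decreasing distances, mirroring the monotonic condition mentioned above.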

Notes

  1. GraphLab_sync and GraphLab_async run different modes of GraphLab (PowerGraph).

References

  1. DBpedia. http://wiki.dbpedia.org/Datasets

  2. Friendster. https://snap.stanford.edu/data/com-Friendster.html

  3. Giraph. http://giraph.apache.org/

  4. Traffic. http://www.dis.uniroma1.it/challenge9/download.shtml

  5. UKWeb. http://law.di.unimi.it/webdata/uk-union-2006-06-2007-05/ (2006)

  6. Acar, U.A.: Self-adjusting computation. Ph.D. thesis, CMU (2005)

  7. Andreev, K., Racke, H.: Balanced graph partitioning. Theory Comput. Syst. 39(6), 929–939 (2006)

  8. Bader, D.A., Cong, G.: Fast shared-memory algorithms for computing the minimum spanning forest of sparse graphs. J. Parallel Distrib. Comput. 66(11), 1366–1378 (2006)

  9. Bang-Jensen, J., Gutin, G.Z.: Digraphs: Theory, Algorithms and Applications. Springer, Berlin (2008)

  10. Baudet, G.M.: Asynchronous iterative methods for multiprocessors. J. ACM 25(2), 226–244 (1978)

  11. Bertsekas, D.P.: Distributed asynchronous computation of fixed points. Math. Program. 27(1), 107–120 (1983)

  12. Chazan, D., Miranker, W.: Chaotic relaxation. Linear Algebr. Appl. 2(2), 199–222 (1969)

  13. Dean, J., Ghemawat, S.: MapReduce: Simplified data processing on large clusters. Commun. ACM 51(1) (2008)

  14. Fan, W., Hu, C., Tian, C.: Incremental graph computations: doable and undoable. In: SIGMOD (2017)

  15. Fan, W., Li, J., Ma, S., Tang, N., Wu, Y., Wu, Y.: Graph pattern matching: from intractability to polynomial time. In: PVLDB (2010)

  16. Fan, W., Wang, X., Wu, Y.: Incremental graph pattern matching. TODS 38(3) (2013)

  17. Fan, W., Wang, X., Wu, Y., Xu, J.: Association rules with graph patterns. PVLDB 8(12), 1502–1513 (2015)

  18. Fan, W., Xu, J., Wu, Y., Yu, W., Jiang, J.: GRAPE: parallelizing sequential graph computations. PVLDB 10(12), 1889–1892 (2017)

  19. Fan, W., et al.: Parallelizing sequential graph computations. In: SIGMOD (2017)

  20. Fredman, M.L., Tarjan, R.E.: Fibonacci heaps and their uses in improved network optimization algorithms. JACM 34(3), 596–615 (1987)

  21. Gallager, R.G., Humblet, P.A., Spira, P.M.: A distributed algorithm for minimum-weight spanning trees. TOPLAS 5(1), 66–77 (1983)

  22. Gonzalez, J.E., Low, Y., Gu, H., Bickson, D., Guestrin, C.: PowerGraph: distributed graph-parallel computation on natural graphs. In: USENIX (2012)

  23. Gonzalez, J.E., Xin, R.S., Dave, A., Crankshaw, D., Franklin, M.J., Stoica, I.: GraphX: graph processing in a distributed dataflow framework. In: OSDI (2014)

  24. Grujic, I., Bogdanovic-Dinic, S., Stoimenov, L.: Collecting and analyzing data from E-Government Facebook pages. In: ICT Innovations (2014)

  25. Han, M., Daudjee, K.: Giraph unchained: barrierless asynchronous parallel execution in pregel-like graph processing systems. PVLDB 8(9), 950–961 (2015)

  26. Han, M., Daudjee, K., Ammar, K., Ozsu, M.T., Wang, X., Jin, T.: An experimental comparison of Pregel-like graph processing systems. VLDB 7(12) (2014)

  27. Henzinger, M.R., Henzinger, T., Kopke, P.: Computing simulations on finite and infinite graphs. In: FOCS (1995)

  28. Ho, Q., et al.: More effective distributed ML via a stale synchronous parallel parameter server. In: NIPS, pp. 1223–1231 (2013)

  29. Jones, N.D.: An introduction to partial evaluation. ACM Comput. Surv. 28(3) (1996)

  30. Kim, M., Candan, K.S.: SBV-Cut: vertex-cut based graph partitioning using structural balance vertices. Data Knowl. Eng. 72, 285–303 (2012)

  31. Li, M., et al.: Parameter server for distributed machine learning. In: NIPS Workshop on Big Learning (2013)

  32. Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., Hellerstein, J.M.: Distributed graphlab: a framework for machine learning in the cloud. PVLDB 5(8) (2012)

  33. Malewicz, G., et al.: Pregel: a system for large-scale graph processing. In: SIGMOD (2010)

  34. McSherry, F., Isard, M., Murray, D.G.: Scalability! but at what cost? In: HotOS (2015)

  35. Nesetril, J., Milková, E., Nesetrilová, H.: Otakar Borůvka on minimum spanning tree problem. Discret. Math. 233(1–3), 3–36 (2001)

  36. Pingali, K., et al.: The tao of parallelism in algorithms. In: PLDI (2011)

  37. Prim, R.C.: Shortest connection networks and some generalizations. Bell Syst. Tech. J. 36(6) (1957)

  38. Radoi, C., Fink, S.J., Rabbah, R.M., Sridharan, M.: Translating imperative code to mapreduce. In: OOPSLA (2014)

  39. Ramalingam, G., Reps, T.: An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms 21(2), 267–305 (1996)

  40. Ramalingam, G., Reps, T.: On the computational complexity of dynamic graph problems. TCS 158(1–2) (1996)

  41. Raychev, V., Musuvathi, M., Mytkowicz, T.: Parallelizing user-defined aggregations using symbolic execution. In: SOSP (2015)

  42. Shao, B., Wang, H., Li, Y.: Trinity: a distributed graph engine on a memory cloud. In: SIGMOD (2013)

  43. Slota, G.M., Rajamanickam, S., Devine, K., Madduri, K.: Partitioning trillion-edge graphs in minutes. In: IPDPS (2017)

  44. Tian, Y., Balmin, A., Corsten, S.A., Tatikonda, S., McPherson, J.: From “think like a vertex” to “think like a graph”. PVLDB 7(7), 193–204 (2013)

  45. Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)

  46. Valiant, L.G.: General purpose parallel architectures. Handbook of Theoretical Computer Science, vol. A (1990)

  47. Wang, G., Xie, W., Demers, A.J., Gehrke, J.: Asynchronous large-scale graph processing made easy. In: CIDR (2013)

  48. Xie, C., Yan, L., Li, W.-J., Zhang, Z.: Distributed power-law graph computing: theoretical and empirical analysis. In: NIPS (2014)

  49. Xing, E.P., Ho, Q., Dai, W., Kim, J.K., Wei, J., Lee, S., Zheng, X., Xie, P., Kumar, A., Yu, Y.: Petuum: a new platform for distributed machine learning on big data. IEEE Trans. Big Data 1(2), 49–67 (2015)

  50. Yan, D., Cheng, J., Lu, Y., Ng, W.: Blogel: a block-centric framework for distributed computation on real-world graphs. PVLDB 7(14), 1981–1992 (2014)

  51. Zhou, Y., Liu, L., Lee, K., Pu, C., Zhang, Q.: Fast iterative graph computation with resource aware graph parallel abstractions. In: HPDC (2015)

Acknowledgments

The paper is a tribute to Professor Chaochen Zhou, who took Fan as an MSc student 30 years ago, despite pressure from a powerful person, whom Fan confronted to get justice done for his late former MSc adviser. The authors are supported in part by 973 Program 2014CB340302, ERC 652976, EPSRC EP/M025268/1, NSFC 61421003, Beijing Advanced Innovation Center for Big Data and Brain Computing, Shenzhen Peacock Program 1105100030834361, and Joint Research Lab between Edinburgh and Huawei.

Author information

Corresponding authors

Correspondence to Wenfei Fan, Lei Hou, Dongze Li or Zizhong Meng.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Fan, W., Liu, M., Xu, R., Hou, L., Li, D., Meng, Z. (2018). Think Sequential, Run Parallel. In: Jones, C., Wang, J., Zhan, N. (eds) Symposium on Real-Time and Hybrid Systems. Lecture Notes in Computer Science, vol 11180. Springer, Cham. https://doi.org/10.1007/978-3-030-01461-2_1

  • DOI: https://doi.org/10.1007/978-3-030-01461-2_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01460-5

  • Online ISBN: 978-3-030-01461-2
