A Vertex-Centric Graph Simulation Algorithm for Large Graphs

  • Conference paper
Big Data (Big Data 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 945)

Abstract

Graph simulation, a well-studied model of graph pattern matching, has been adopted to reduce computational complexity and to meet the needs of novel applications such as mining potential associations between users in online social networks. In recent years, graph processing frameworks such as Pregel have introduced a vertex-centric, Bulk Synchronous Parallel (BSP) programming model for processing massive data graphs and have achieved encouraging results. However, developing efficient vertex-centric algorithms for graph simulation is very challenging, because the problem does not naturally align with the vertex-centric programming model. This paper presents novel distributed algorithms for graph simulation based on the vertex-centric programming model. In addition, given the high cost of message passing and the algorithmic complexity of pattern matching on massive data graphs, the message-passing part of the algorithm is optimized to reduce communication cost. We experimentally verify the effectiveness and efficiency of these algorithms using a real-life massive data graph.
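To make the problem concrete, the following is a minimal single-process sketch of the standard graph simulation relation, computed in synchronized rounds that loosely mimic BSP supersteps. It is not the distributed algorithm of the paper and contains none of its message-passing optimizations; the function name `graph_simulation`, the dictionary-based graph encoding, and the vertex names in the usage note are assumptions made purely for illustration.

```python
def graph_simulation(q_labels, q_edges, g_labels, g_edges):
    """Return the maximum simulation as {pattern vertex: set of data vertices}.

    q_labels / g_labels map vertex -> label; q_edges / g_edges are
    iterables of directed (source, target) pairs. (Hypothetical encoding.)
    """
    # Children of each pattern vertex and successors of each data vertex.
    q_children = {u: set() for u in q_labels}
    for u, u2 in q_edges:
        q_children[u].add(u2)
    g_succ = {v: set() for v in g_labels}
    for v, v2 in g_edges:
        g_succ[v].add(v2)

    # Per-data-vertex state: pattern vertices it may still match (label test).
    match = {
        v: {u for u in q_labels if q_labels[u] == g_labels[v]}
        for v in g_labels
    }

    changed = True
    while changed:  # each iteration plays the role of one superstep
        changed = False
        # Vertices only see the state their neighbours had in the previous round.
        snapshot = {v: set(m) for v, m in match.items()}
        for v in g_labels:
            for u in snapshot[v]:
                # u survives at v only if every pattern child of u is still
                # matched by at least one successor of v.
                ok = all(
                    any(u2 in snapshot[w] for w in g_succ[v])
                    for u2 in q_children[u]
                )
                if not ok:
                    match[v].discard(u)
                    changed = True

    sim = {u: {v for v in g_labels if u in match[v]} for u in q_labels}
    # By convention, the pattern matches only if every pattern vertex has a match.
    if any(not vs for vs in sim.values()):
        return {u: set() for u in q_labels}
    return sim
```

For a pattern with a single edge from an `A`-labelled vertex to a `B`-labelled vertex, a data vertex labelled `A` is kept only if it has at least one `B`-labelled successor. In a real vertex-centric framework the snapshot lookup would be replaced by messages exchanged between supersteps, which is where communication cost arises.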

Notes

  1. http://snap.stanford.edu/data/amazon-meta.html


Acknowledgements

The authors acknowledge the financial support from the following foundations: National Key R&D Program of China (No. 2017YFC0803700), National Natural Science Foundation of China (61562091), Natural Science Foundation of Yunnan Province (2016FB110).

Author information

Corresponding author

Correspondence to Jin Li.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, J., Li, J., Wang, X. (2018). A Vertex-Centric Graph Simulation Algorithm for Large Graphs. In: Xu, Z., Gao, X., Miao, Q., Zhang, Y., Bu, J. (eds) Big Data. Big Data 2018. Communications in Computer and Information Science, vol 945. Springer, Singapore. https://doi.org/10.1007/978-981-13-2922-7_16

  • DOI: https://doi.org/10.1007/978-981-13-2922-7_16

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-2921-0

  • Online ISBN: 978-981-13-2922-7

  • eBook Packages: Computer Science (R0)
