A Block-Based Edge Partitioning for Random Walks Algorithms over Large Social Graphs

  • Conference paper
Web Information Systems Engineering – WISE 2016 (WISE 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10042)

Abstract

Recent results [5, 9, 23] show that edge partitioning approaches (also known as vertex-cut) outperform vertex partitioning (edge-cut) approaches for computations on large, skewed graphs such as social networks. Vertex-cut approaches generally avoid the unbalanced computation caused by power-law degree distributions. However, existing methods, such as uniform random assignment [23] or greedy assignment [9], are generic and do not take into account the computation pattern of any specific graph algorithm. In this paper we propose a vertex-cut partitioning dedicated to random walk algorithms that takes advantage of the topological properties of the graph. It relies on a block-based approach that captures local communities. Our split and merge algorithms achieve load balancing across the workers and maintain it dynamically. Our experiments illustrate the benefit of our partitioning: it significantly reduces the communication cost of random walk-based algorithms compared with existing approaches.
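
To make the generic baseline concrete, uniform random edge assignment (a vertex-cut) can be sketched in a few lines. This is an illustrative sketch only: it is neither the block-based algorithm proposed in the paper nor the GraphX implementation, and the function names `random_vertex_cut` and `replication_factor` are hypothetical. The replication factor, i.e. the average number of partitions in which a vertex appears, is a commonly used proxy for communication cost in vertex-cut systems.

```python
import random
from collections import defaultdict


def random_vertex_cut(edges, num_partitions, seed=0):
    """Assign every edge to a partition chosen uniformly at random.

    Illustrative baseline only; `edges` is an iterable of (src, dst) pairs.
    """
    rng = random.Random(seed)
    return {edge: rng.randrange(num_partitions) for edge in edges}


def replication_factor(assignment):
    """Average number of partitions in which each vertex is replicated.

    `assignment` maps (src, dst) edges to partition ids. A value of 1.0
    means no vertex is cut; larger values mean more vertex copies to sync.
    """
    partitions_of = defaultdict(set)
    for (src, dst), part in assignment.items():
        partitions_of[src].add(part)
        partitions_of[dst].add(part)
    total_copies = sum(len(parts) for parts in partitions_of.values())
    return total_copies / len(partitions_of)


if __name__ == "__main__":
    # A small star graph: vertex 0 is a high-degree hub, as in skewed graphs.
    star = [(0, leaf) for leaf in range(1, 9)]
    assignment = random_vertex_cut(star, num_partitions=4)
    print(replication_factor(assignment))
```

Because the assignment ignores topology, a random walk that follows consecutive edges will often hop between partitions, which is the kind of cost a partitioning tailored to random walks aims to reduce.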

Notes

  1. https://snap.stanford.edu/data/index.html.

  2. See details and implementations at http://spark.apache.org/docs/latest/api/scala/index.html.

References

  1. Andersen, R., Chung, F., Lang, K.: Local graph partitioning using PageRank vectors. In: FOCS, pp. 475–486 (2006)

  2. Apache. Giraph. http://giraph.apache.org

  3. Bahmani, B., Chakrabarti, K., Xin, D.: Fast personalized PageRank on MapReduce. In: SIGMOD, pp. 973–984 (2011)

  4. Bahmani, B., Chowdhury, A., Goel, A.: Fast incremental and personalized PageRank. Proc. VLDB Endow. 4(3), 173–184 (2010)

  5. Bourse, F., Lelarge, M., Vojnovic, M.: Balanced graph edge partition. In: SIGKDD, pp. 1456–1465 (2014)

  6. Chierichetti, F., Kumar, R., Lattanzi, S., Mitzenmacher, M., Panconesi, A., Raghavan, P.: On compressing social networks. In: SIGKDD, pp. 219–228 (2009)

  7. Dahimene, R., Constantin, C., du Mouza, C.: RecLand: a recommender system for social networks. In: CIKM, pp. 2063–2065 (2014)

  8. Gleich, D.F., Seshadhri, C.: Vertex neighborhoods, low conductance cuts, and good seeds for local community methods. In: SIGKDD, pp. 597–605 (2012)

  9. Gonzalez, J.E., Low, Y., Gu, H., Bickson, D., Guestrin, C.: PowerGraph: distributed graph-parallel computation on natural graphs. In: OSDI, pp. 17–30 (2012)

  10. Jeh, G., Widom, J.: Scaling personalized web search. In: WWW, pp. 271–279 (2003)

  11. Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1), 359–392 (1998)

  12. Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Techn. J. 49(2), 291–307 (1970)

  13. Leskovec, J., Lang, K.J., Dasgupta, A., Mahoney, M.W.: Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Math. 6, 29–123 (2008)

  14. Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., Hellerstein, J.M.: Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5(8), 716–727 (2012)

  15. Takac, L., Zabovsky, M.: Data analysis in public social networks. In: Present Day Trends of Innovations, pp. 1–6 (2012)

  16. Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: SIGMOD, pp. 135–146 (2010)

  17. Newman, M., Barabasi, A.-L., Watts, D.J.: The Structure and Dynamics of Networks (Princeton Studies in Complexity). Princeton University Press, Princeton (2006)

  18. Roy, A., Bindschaedler, L., Malicevic, J., Zwaenepoel, W.: Chaos: scale-out graph processing from secondary storage. In: SOSP, pp. 410–424 (2015)

  19. Salihoglu, S., Widom, J.: GPS: a graph processing system. In: SSDBM, pp. 22:1–22:12 (2013)

  20. Sarkar, P., Moore, A.W.: Fast nearest-neighbor search in disk-resident graphs. In: SIGKDD, pp. 513–522 (2010)

  21. Valiant, L.G.: A bridging model for multi-core computing. J. Comput. Syst. Sci. 77(1), 154–166 (2011)

  22. Whang, J.J., Gleich, D.F., Dhillon, I.S.: Overlapping community detection using seed set expansion. In: CIKM, pp. 2099–2108 (2013)

  23. Xin, R.S., Gonzalez, J.E., Franklin, M.J., Stoica, I.: GraphX: a resilient distributed graph system on Spark. In: GRADES, pp. 1–6 (2013)

  24. Yang, S., Yan, X., Zong, B., Khan, A.: Towards effective partition management for large graphs. In: SIGMOD, pp. 517–528 (2012)

  25. Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: NSDI, p. 2 (2012)

Author information

Corresponding author

Correspondence to Cedric du Mouza.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Li, Y., Constantin, C., du Mouza, C. (2016). A Block-Based Edge Partitioning for Random Walks Algorithms over Large Social Graphs. In: Cellary, W., Mokbel, M., Wang, J., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2016. WISE 2016. Lecture Notes in Computer Science, vol 10042. Springer, Cham. https://doi.org/10.1007/978-3-319-48743-4_22

  • DOI: https://doi.org/10.1007/978-3-319-48743-4_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48742-7

  • Online ISBN: 978-3-319-48743-4

  • eBook Packages: Computer Science, Computer Science (R0)
