GraphScSh: Efficient I/O Scheduling and Graph Sharing for Concurrent Graph Processing

  • Shang Liu
  • Zhan Shi (corresponding author)
  • Dan Feng
  • Shuo Chen
  • Fang Wang
  • Yamei Peng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11783)

Abstract

With the increasing need to analyze graph data, graph systems must handle concurrent graph processing (CGP) jobs efficiently. However, existing platforms are inherently designed for a single job, so they incur high cost when CGP jobs are executed. In this work, we observe that existing systems do not allow CGP jobs to share the graph structure data of each iteration, which introduces redundant accesses to the same graph. Moreover, all the graphs are real-world graphs with highly skewed power-law degree distributions. The gain from adding more external storage devices diminishes rapidly, so reasonable scheduling is needed to balance I/O pressure across the storage devices. Following this direction, we propose GraphScSh, which handles CGP jobs efficiently on a single machine by reducing I/O conflicts and sharing graph structure data among CGP jobs. We apply a CGP balanced partition method to break graphs into multiple partitions that are stored on multiple external storage devices. Additionally, we present a CGP I/O scheduling method that reduces I/O conflicts and lets graph data be shared among multiple jobs. We have implemented GraphScSh in C++, and experiments show that GraphScSh outperforms existing out-of-core systems by up to 82%.
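
The abstract does not spell out how partitions are balanced across storage devices. As a rough illustration of the general idea only, the sketch below uses a standard greedy largest-first assignment, which is a stand-in technique and not the CGP balanced partition method of the paper. The function names `assign_partitions` and `device_loads`, and the use of per-partition edge counts as a proxy for I/O pressure, are assumptions made for this example.

```python
import heapq
from typing import Dict, List


def assign_partitions(edge_counts: List[int], num_devices: int) -> Dict[int, List[int]]:
    """Greedily place partitions on devices, largest partition first.

    Hypothetical helper: edge_counts[p] is the number of edges in
    partition p, used here as an assumed proxy for its I/O cost.
    """
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    # Min-heap of (edges already placed on device, device id).
    heap = [(0, d) for d in range(num_devices)]
    heapq.heapify(heap)
    placement: Dict[int, List[int]] = {d: [] for d in range(num_devices)}
    # Visit partitions from heaviest to lightest (stable for ties).
    order = sorted(range(len(edge_counts)), key=lambda p: edge_counts[p], reverse=True)
    for p in order:
        load, dev = heapq.heappop(heap)  # least-loaded device so far
        placement[dev].append(p)
        heapq.heappush(heap, (load + edge_counts[p], dev))
    return placement


def device_loads(edge_counts: List[int], placement: Dict[int, List[int]]) -> Dict[int, int]:
    """Total edges stored on each device under a given placement."""
    return {d: sum(edge_counts[p] for p in parts) for d, parts in placement.items()}
```

With a skewed input such as `[90, 40, 30, 20, 10, 10]` and two devices, the heaviest partition is placed first and the lighter ones fill in around it, so neither device ends up holding most of the edges.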

Keywords

Graph processing · CGP jobs · Graph sharing · I/O scheduling

Notes

Acknowledgments

This work is supported by NSFC No. 61772216, 61821003, and U1705261; Wuhan Application Basic Research Project No. 2017010201010103; the Fund from the Science, Technology and Innovation Commission of Shenzhen Municipality No. JCYJ20170307172248636; and the Fundamental Research Funds for the Central Universities.

Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  1. Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China
  2. Didi, Inc., Beijing, China
