GRAM: A GPU-Based Property Graph Traversal and Query for HPC Rich Metadata Management

  • Wenke Li
  • Xuanhua Shi (email author)
  • Hong Huang
  • Peng Zhao
  • Hai Jin
  • Dong Dai
  • Yong Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11276)

Abstract

In HPC systems, rich metadata describe detailed information about data files, such as the executions that produced them, the environment variables, and the parameters of those executions. Recent studies have shown the feasibility of modeling rich metadata as a property graph and querying it through graph traversal. We propose using GPUs to process rich metadata graphs. Two challenges arise in using GPUs for metadata graph queries. First, there is not yet a suitable data representation for the metadata graph on the GPU. Second, there are no optimization techniques designed specifically for metadata graph traversal on the GPU. To tackle these challenges, we propose GRAM, a GPU-based property graph traversal and query framework. GRAM represents the metadata graph on the GPU in Compressed Sparse Row (CSR) format and stores properties in a Structure of Arrays (SoA) layout. In addition, we propose two new optimizations, parallel filtering and basic operations merging, to accelerate metadata graph traversal. Our evaluation results show that GRAM can be effectively applied to user scenarios in HPC systems and greatly improves the performance of metadata management.
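
To make the two data layouts named in the abstract concrete, the following is a minimal CPU-side sketch in Python, not GRAM's actual implementation. It stores a toy metadata graph in CSR form (a row-offset array plus a column-index array) and keeps vertex properties as one array per property (SoA). The function names `build_csr` and `traverse_step`, the property names `type` and `size`, and the toy graph are all hypothetical; the sequential loops stand in for work that a GPU would spread across threads.

```python
from itertools import accumulate


def build_csr(num_vertices, edges):
    """Build CSR arrays (row_offsets, col_indices) from (src, dst) pairs."""
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    # row_offsets[v] .. row_offsets[v + 1] delimits the neighbors of v.
    row_offsets = [0] + list(accumulate(degree))
    col_indices = [0] * len(edges)
    cursor = row_offsets[:-1].copy()  # next free slot for each vertex
    for src, dst in edges:
        col_indices[cursor[src]] = dst
        cursor[src] += 1
    return row_offsets, col_indices


def traverse_step(frontier, row_offsets, col_indices, props, key, predicate):
    """Expand the frontier by one hop, keeping neighbors whose property passes.

    Each (vertex, neighbor) check is independent of the others, which is
    what makes this filtering step a candidate for parallel execution.
    """
    column = props[key]  # SoA: one contiguous array per property
    seen = set()
    next_frontier = []
    for v in frontier:
        for i in range(row_offsets[v], row_offsets[v + 1]):
            n = col_indices[i]
            if n not in seen and predicate(column[n]):
                seen.add(n)
                next_frontier.append(n)
    return next_frontier


# Hypothetical toy graph: vertex 0 is a user, 1-2 are jobs, 3-4 are files.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4)]
vertex_props = {
    "type": ["user", "job", "job", "file", "file"],
    "size": [0, 0, 0, 10, 2048],
}

row_offsets, col_indices = build_csr(5, edges)
jobs = traverse_step([0], row_offsets, col_indices,
                     vertex_props, "type", lambda t: t == "job")
big_files = traverse_step(jobs, row_offsets, col_indices,
                          vertex_props, "size", lambda s: s > 1024)
```

In this sketch a two-hop query ("files larger than 1024 bytes written by jobs of user 0") is two calls to `traverse_step`. Because each property is its own array, a filter on `size` reads only that array and never touches unrelated properties, which is the usual motivation for an SoA layout.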

Keywords

Rich metadata management, Property graph, Graph traversal, GPU

Notes

Acknowledgement

The work is supported by the National Key R&D Program of China (No. 2017YFC0803700), NSFC (No. 61772218, 61433019, U1435217), and the Outstanding Youth Foundation of Hubei Province (No. 2016CFA032).

Copyright information

© IFIP International Federation for Information Processing 2018

Authors and Affiliations

  • Wenke Li (1)
  • Xuanhua Shi (1, email author)
  • Hong Huang (1)
  • Peng Zhao (1)
  • Hai Jin (1)
  • Dong Dai (2)
  • Yong Chen (2)
  1. Services Computing Technology and System Lab, Big Data Technology and System Lab, Huazhong University of Science and Technology, Wuhan, China
  2. Department of Computer Science, Texas Tech University, Lubbock, USA
