GCM-Bench: A Benchmark for RDF Data Management System on Microorganism Data

  • Conference paper
Big Scientific Data Benchmarks, Architecture, and Systems (SDBA 2018)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 911)

Abstract

Biological data is growing to an unprecedented scale. One example is the microorganism knowledge graph curated by biologists, which is represented in the Resource Description Framework (RDF) data model. This paper proposes GCM-Bench, a new benchmark for evaluating the performance of general-purpose RDF data management systems on microorganism RDF data. GCM-Bench consists of a microorganism RDF data generator, SPARQL query workloads, and an automatic test system that executes the test workloads and monitors resource utilization. Five RDF data management systems are evaluated on datasets of different sizes using the automatic test system. We expect GCM-Bench to help microbiologists and system developers select an RDF data management system suited to their needs.

Acknowledgment

This work is supported by the National Key Research and Development Plan of China (Grant Nos. 2016YFB1000600 and 2016YFB1000601).

Author information

Corresponding author

Correspondence to Renfeng Liu.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, R., Xu, J. (2019). GCM-Bench: A Benchmark for RDF Data Management System on Microorganism Data. In: Ren, R., Zheng, C., Zhan, J. (eds) Big Scientific Data Benchmarks, Architecture, and Systems. SDBA 2018. Communications in Computer and Information Science, vol 911. Springer, Singapore. https://doi.org/10.1007/978-981-13-5910-1_1

  • DOI: https://doi.org/10.1007/978-981-13-5910-1_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-5909-5

  • Online ISBN: 978-981-13-5910-1

  • eBook Packages: Computer Science, Computer Science (R0)
