Abstract
Biological data is growing to an unprecedented scale. One example is the microorganism knowledge graph organized by biologists, which is represented in the Resource Description Framework (RDF) data model. This paper proposes GCM-Bench, a new benchmark for evaluating the performance of general-purpose RDF data management systems on microorganism RDF data. GCM-Bench consists of a microorganism RDF data generator, SPARQL query workloads, and an automatic test system that executes the test workloads and monitors resource utilization. Five RDF data management systems are evaluated on datasets of different sizes using the automatic test system. We believe GCM-Bench will help microbiologists and system developers select an RDF data management system suited to their needs.
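To make the idea of an automatic test system concrete, the following is a minimal sketch of a workload runner that executes named SPARQL queries repeatedly and records timing and memory. It is not the GCM-Bench implementation: the function names, the two sample queries, and the stand-in executor `fake_execute` are all assumptions for illustration, and `tracemalloc` only measures the Python harness itself, whereas a real benchmark would monitor the database server process.

```python
import time
import tracemalloc
from typing import Callable, Dict, List

def run_workload(execute: Callable[[str], int],
                 queries: Dict[str, str],
                 repeats: int = 3) -> List[dict]:
    """Run every named query `repeats` times; record mean latency and peak memory."""
    results = []
    for name, sparql in queries.items():
        timings = []
        rows = 0
        tracemalloc.start()  # stand-in for real resource monitoring
        for _ in range(repeats):
            start = time.perf_counter()
            rows = execute(sparql)  # executor returns the number of result rows
            timings.append(time.perf_counter() - start)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append({
            "query": name,
            "rows": rows,
            "mean_seconds": sum(timings) / len(timings),
            "peak_bytes": peak,
        })
    return results

def fake_execute(sparql: str) -> int:
    """Hypothetical executor: counts '?' characters instead of querying a store."""
    return sparql.count("?")

# Hypothetical workload; these are generic queries, not the benchmark's own.
QUERIES = {
    "Q1": "SELECT ?s WHERE { ?s ?p ?o }",
    "Q2": "SELECT ?name WHERE { ?taxon <http://example.org/name> ?name }",
}

if __name__ == "__main__":
    for row in run_workload(fake_execute, QUERIES, repeats=2):
        print(row)
```

In practice `fake_execute` would be swapped for a function that submits the query to the system under test, so the same harness can be pointed at each system in turn.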
Acknowledgment
This work is supported by the National Key Research and Development Plan of China (Grant Nos. 2016YFB1000600 and 2016YFB1000601).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liu, R., Xu, J. (2019). GCM-Bench: A Benchmark for RDF Data Management System on Microorganism Data. In: Ren, R., Zheng, C., Zhan, J. (eds) Big Scientific Data Benchmarks, Architecture, and Systems. SDBA 2018. Communications in Computer and Information Science, vol 911. Springer, Singapore. https://doi.org/10.1007/978-981-13-5910-1_1
DOI: https://doi.org/10.1007/978-981-13-5910-1_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-5909-5
Online ISBN: 978-981-13-5910-1
eBook Packages: Computer Science, Computer Science (R0)