Graph DBs vs. Column-Oriented Stores: A Pure Performance Comparison

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9511)

Abstract

Cloud computing has greatly changed how information is stored and how applications run. For one or more clusters to work as a cloud, a middleware framework such as Apache Hadoop [17] is needed to provide reliability, scalability and distributed computing. Once the infrastructure is established, a software framework can be installed on top of it to serve as the interface to the applications developed by users. In this setting, that software is a NoSQL database. This paper deals with the problem of searching data in some widespread NoSQL databases used in cloud computing. Two categories of NoSQL databases are compared: a column-oriented key-value store, HBase [6], and a highly available graph database, Neo4j [11]. HBase is a distributed, scalable storage system that runs on top of HDFS and was designed after Google’s BigTable [4]. Neo4j was designed and developed as a reliable database optimized for graph structures instead of tables; it is a robust, scalable, high-performance and highly available database that supports ACID transactions and queries written in the Cypher language. The aim of this paper is to create a novel system that decides whether a query should be sent to a key-value store or to a graph database for execution. To this end, an experimental pure performance comparison was made between Apache HBase and Neo4j for a variety of queries, which were programmed in Java using each system’s API.
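
The routing idea stated in the abstract can be pictured with a minimal Java sketch. Everything in it is an assumption made for illustration: the class `QueryRouter`, the `QueryShape` fields, the movie-style example queries and, above all, the rule itself (plain key lookups go to the column store, relationship traversals go to the graph database). The abstract does not state the decision procedure actually used, so this should not be read as the authors' method.

```java
import java.util.List;

// Hypothetical sketch: the routing rule below is an illustrative assumption,
// not the decision procedure evaluated in the paper.
final class QueryRouter {

    /** The two store categories compared in the paper. */
    enum Backend { COLUMN_STORE, GRAPH_DB }

    /** Minimal query description; both fields are hypothetical. */
    static final class QueryShape {
        final String description;
        final int relationshipHops; // 0 means a plain lookup by key

        QueryShape(String description, int relationshipHops) {
            if (relationshipHops < 0) {
                throw new IllegalArgumentException("relationshipHops must be >= 0");
            }
            this.description = description;
            this.relationshipHops = relationshipHops;
        }
    }

    /** Assumed rule: key lookups go to the column store, traversals to the graph database. */
    static Backend route(QueryShape query) {
        return query.relationshipHops == 0 ? Backend.COLUMN_STORE : Backend.GRAPH_DB;
    }

    public static void main(String[] args) {
        List<QueryShape> queries = List.of(
            new QueryShape("fetch one movie by id", 0),
            new QueryShape("actors who worked with a given actor", 2));
        for (QueryShape q : queries) {
            System.out.println(q.description + " -> " + route(q));
        }
    }
}
```

In a system of the kind the abstract describes, the fixed threshold would presumably be replaced by whatever criterion the measured HBase and Neo4j timings support for each query type.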

Notes

  1. https://github.com/kendea/dataset_movies

  2. https://github.com/kendea/hbase_neo4j

References

  1. Angles, R., Gutierrez, C.: Survey of graph database models. ACM Comput. Surv. 40(1), 1:1–1:39 (2008)

  2. Brewer, E.: CAP twelve years later: how the “rules” have changed. Computer 45(2), 23–29 (2012)

  3. Cai, L., Huang, S., Chen, L., Zheng, Y.: Performance analysis and testing of HBase based on its architecture. In: 2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS), pp. 353–358, June 2013

  4. Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. In: Proceedings of the 7th Symposium on Operating Systems Design and Implementation, pp. 205–218. OSDI 2006, USENIX Association, Berkeley, CA, USA (2006)

  5. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. SIGOPS Oper. Syst. Rev. 41(6), 205–220 (2007)

  6. George, L.: HBase: The Definitive Guide. O’Reilly Media Inc., Sebastopol (2011)

  7. Holzschuher, F., Peinl, R.: Performance of graph query languages: comparison of Cypher, Gremlin and native access in Neo4j. In: Proceedings of the Joint EDBT/ICDT 2013 Workshops, EDBT 2013, NY, USA, pp. 195–204. ACM, New York (2013)

  8. Kostylev, E.V., Reutter, J.L., Vrgoc, D.: Containment of data graph queries. In: ICDT, pp. 131–142 (2014)

  9. Chodorow, K., Dirolf, M.: MongoDB: The Definitive Guide. O’Reilly Media, Sebastopol (2010)

  10. Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. ACM SIGOPS Oper. Syst. Rev. 44(2), 35–40 (2010)

  11. Neo4j.org: Neo4j - the world’s leading graph database. http://www.neo4j.org/. Accessed 16 June 2014

  12. Nishimura, S., Das, S., Agrawal, D., El Abbadi, A.: MD-HBase: a scalable multi-dimensional data infrastructure for location aware services. In: 2011 12th IEEE International Conference on Mobile Data Management (MDM), vol. 1, pp. 7–16, June 2011

  13. Robinson, I., Webber, J., Eifrem, E.: Graph Databases. O’Reilly Media, Inc., Sebastopol (2013)

  14. Shvachko, K., Kuang, H., Radia, S., Chansler, R.: The Hadoop distributed file system. In: 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–10, May 2010

  15. Thusoo, A., Sarma, J.S., Jain, N., Shao, Z., Chakka, P., Anthony, S., Liu, H., Wyckoff, P., Murthy, R.: Hive: a warehousing solution over a map-reduce framework. Proc. VLDB Endow. 2(2), 1626–1629 (2009)

  16. Vicknair, C., Macias, M., Zhao, Z., Nan, X., Chen, Y., Wilkins, D.: A comparison of a graph database and a relational database: a data provenance perspective. In: Proceedings of the 48th Annual Southeast Regional Conference, ACM SE 2010, NY, USA, pp. 42:1–42:6. ACM, New York (2010)

  17. White, T.: Hadoop: The Definitive Guide, 3rd edn. O’Reilly Media Inc., Sebastopol (2012)

  18. Wood, P.T.: Query languages for graph databases. SIGMOD Rec. 41(1), 50–60 (2012)

Acknowledgments

We thank the C. Caratheodory Research Program of the University of Patras, Greece, for supporting this research.

Author information

Corresponding author

Correspondence to Marios Kendea.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kendea, M., Gkantouna, V., Rapti, A., Sioutas, S., Tzimas, G., Tsolis, D. (2016). Graph DBs vs. Column-Oriented Stores: A Pure Performance Comparison. In: Karydis, I., Sioutas, S., Triantafillou, P., Tsoumakos, D. (eds) Algorithmic Aspects of Cloud Computing. ALGOCLOUD 2015. Lecture Notes in Computer Science, vol 9511. Springer, Cham. https://doi.org/10.1007/978-3-319-29919-8_5

  • DOI: https://doi.org/10.1007/978-3-319-29919-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29918-1

  • Online ISBN: 978-3-319-29919-8

  • eBook Packages: Computer Science, Computer Science (R0)
