Performance Evaluation of NoSQL Databases

  • Andrea Gandini
  • Marco Gribaudo
  • William J. Knottenbelt
  • Rasha Osman
  • Pietro Piazzolla
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8721)

Abstract

NoSQL databases have emerged as a backend to support Big Data applications. NoSQL databases are characterized by horizontal scalability, schema-free data models, and easy cloud deployment. To avoid overprovisioning, it is essential to be able to identify the correct number of nodes required for a specific system before deployment. This paper benchmarks and compares three of the most common NoSQL databases: Cassandra, MongoDB and HBase. We deploy them on the Amazon EC2 cloud platform using different types of virtual machines and cluster sizes to study the effect of different configurations. We then compare the behavior of these systems to high-level queueing network models. Our results show that the models are able to capture the main performance characteristics of the studied databases and form the basis for a capacity planning tool for service providers and service users.
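As a rough illustration of how a high-level queueing network model can feed a capacity planning estimate, the sketch below applies exact Mean Value Analysis (MVA) to a closed, single-class network. It is not the authors' model: MVA is a standard textbook technique swapped in here for illustration, and the function names (`mva`, `nodes_needed`), the even spreading of service demand across nodes, and all numeric values are assumptions made only for this example.

```python
def mva(demands, think_time, n_clients):
    """Exact MVA for a closed, single-class network of queueing stations.

    demands     -- service demand (seconds per request) at each station
    think_time  -- client think time in seconds
    n_clients   -- number of concurrent clients
    Returns (throughput in requests/s, response time in seconds).
    """
    queue = [0.0] * len(demands)   # mean queue length at each station
    throughput = 0.0
    response = 0.0
    for n in range(1, n_clients + 1):
        # Residence time: an arriving request waits behind the queue
        # observed with one fewer client in the network.
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(resid)
        throughput = n / (think_time + response)
        # Little's law gives the new mean queue lengths.
        queue = [throughput * r for r in resid]
    return throughput, response


def nodes_needed(target, total_demand, think_time, n_clients, max_nodes=64):
    """Smallest cluster size whose modeled throughput reaches `target`.

    ASSUMPTION: the per-request demand is split evenly over the nodes
    (perfect load balancing, no replication or coordination overhead).
    Returns None if the target is not reached within `max_nodes`.
    """
    for k in range(1, max_nodes + 1):
        x, _ = mva([total_demand / k] * k, think_time, n_clients)
        if x >= target:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical workload: 10 ms of demand per request, 10 clients.
    print(nodes_needed(150.0, 0.010, 0.0, 10))
```

In practice the service demands would be measured on a real deployment rather than assumed, and the model would need extra terms for replication and consistency traffic before its node counts could be trusted.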

Keywords

Virtual Machine, Replication Factor, Data Node, Read Request, Queue Network Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Andrea Gandini (1)
  • Marco Gribaudo (1)
  • William J. Knottenbelt (2)
  • Rasha Osman (2)
  • Pietro Piazzolla (1)

  1. Politecnico di Milano, Milano, Italy
  2. Department of Computing, Imperial College London, London, UK