Abstract
Analytic databases are designed for analytical applications that aim to extract value from massive data. They are widely used in many areas, from business analytics to scientific data discovery. To process massive data efficiently, analytic databases can often be scaled out to achieve high performance. In this paper, to understand the intrinsic performance characteristics of analytic databases, we take the popular Greenplum database as a representative system and conduct a series of comprehensive performance evaluations to characterize its scalability and performance features. From the experimental results and analysis, we obtain a series of initial insights into the factors that may significantly affect the performance behavior of Greenplum, which should help users of analytic database systems obtain better analytical performance.
Acknowledgments
This work was supported by the China Ministry of Science and Technology under the State Key Development Program for Basic Research (2012CB821800), the National Natural Science Foundation of China (No. 61462012, 61562010), the Joint Research Fund in Astronomy under cooperative agreement between the National Natural Science Foundation of China and the Chinese Academy of Sciences (No. U1531246), the Strategic Priority Research Program “The Emergence of Cosmological Structures” of the Chinese Academy of Sciences (No. XDB09000000), the High Tech. Project Fund of the Guizhou Development and Reform Commission (No. [2013]2069), the Industrial Research Projects of the Science and Technology Plan of Guizhou Province (No. GY[2014]3018), and the Major Applied Basic Research Program of Guizhou Province (No. JZ20142001, JZ20142001-01, JZ20142001-05).
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Li, Y., Li, H., Chen, M., Dai, Z., Zhu, M. (2016). Characterizing the Scalability and Performance of Analytic Database System. In: Wang, G., Ray, I., Alcaraz Calero, J., Thampi, S. (eds) Security, Privacy and Anonymity in Computation, Communication and Storage. SpaCCS 2016. Lecture Notes in Computer Science, vol 10067. Springer, Cham. https://doi.org/10.1007/978-3-319-49145-5_18
DOI: https://doi.org/10.1007/978-3-319-49145-5_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49144-8
Online ISBN: 978-3-319-49145-5
eBook Packages: Computer Science (R0)