Abstract
Most research on cloud storage using erasure codes concentrates on finding optimal solutions for either lower storage capacity or lower bandwidth consumption. In this paper, our goal is to provide erasure code (EC) functionality directly from the application layer. For this purpose, we reviewed several application-layer languages, namely Hive, Pig and Oozie, and opted to add EC support to Hive. We developed several Hive commands that allow Hive tables to be first archived and then encoded or decoded with different parameters, such as join and union. We tested our implementation using the MovieLens dataset, both locally and on the cloud, and compared its performance against a replicated system.
Acknowledgement
We thank Associate Professor Anwitaman Datta of NTU, Singapore, for his constant support and expert reviews, which greatly assisted the research.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chiniah, A., Einstein, M.U.A. (2019). HIVE-EC: Erasure Code Functionality in HIVE Through Archiving. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Advances in Information and Communication Networks. FICC 2018. Advances in Intelligent Systems and Computing, vol 887. Springer, Cham. https://doi.org/10.1007/978-3-030-03405-4_21
DOI: https://doi.org/10.1007/978-3-030-03405-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03404-7
Online ISBN: 978-3-030-03405-4
eBook Packages: Engineering, Engineering (R0)