Abstract
ZLIB is used in diverse frameworks by the scientific community, both to reduce disk storage and to alleviate pressure on I/O. As it becomes a bottleneck on multi-core systems, higher-throughput alternatives must be considered, exploiting parallelism and/or more effective compression schemes. This work provides a comparative study of the ZLIB, LZ4 and FPC compressors (serial and parallel implementations), focusing on compression ratio (CR), bandwidth and speedup. LZ4 provides very high throughput (decompressing at over 1 GB/s versus 120 MB/s for ZLIB), but its CR degrades by 5-10%. FPC also provides higher throughput than ZLIB, but its CR varies considerably with the data. ZLIB and LZ4 can achieve almost linear speedups for some datasets, while the current implementation of parallel FPC provides little, if any, performance gain. For the ROOT dataset, LZ4 was found to provide higher CR, better scalability and lower memory consumption than FPC, thus emerging as the better alternative to ZLIB.
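The three metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' benchmark: it uses only Python's standard `zlib` module (LZ4 and FPC are not in the standard library), and it assumes CR is defined as original size divided by compressed size, and that parallelism is obtained by compressing fixed-size chunks independently. The helper names `compression_ratio`, `throughput_mb_s`, `compress_chunks`, `decompress_chunks` and `benchmark` are hypothetical.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """CR, assumed here to be original size / compressed size (higher is better)."""
    return len(original) / len(compressed)


def throughput_mb_s(num_bytes: int, seconds: float) -> float:
    """Bandwidth in MiB/s for processing num_bytes of uncompressed data."""
    return num_bytes / (1024 * 1024) / seconds if seconds > 0 else float("inf")


def compress_chunks(data: bytes, chunk_size: int, workers: int) -> list:
    """Split data into fixed-size chunks and compress each one independently."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so the output can be decompressed in sequence.
        return list(pool.map(zlib.compress, chunks))


def decompress_chunks(chunks: list) -> bytes:
    """Decompress the chunks in order and reassemble the original buffer."""
    return b"".join(zlib.decompress(c) for c in chunks)


def benchmark(data: bytes, chunk_size: int = 1 << 20, workers: int = 4) -> dict:
    """Measure CR, compression bandwidth and speedup of chunked ZLIB compression."""
    start = time.perf_counter()
    compress_chunks(data, chunk_size, 1)          # single-worker baseline
    t_serial = time.perf_counter() - start

    start = time.perf_counter()
    compressed = compress_chunks(data, chunk_size, workers)
    t_parallel = time.perf_counter() - start

    return {
        "cr": compression_ratio(data, b"".join(compressed)),
        "serial_mb_s": throughput_mb_s(len(data), t_serial),
        "parallel_mb_s": throughput_mb_s(len(data), t_parallel),
        # Speedup = serial time / parallel time; linear speedup would equal `workers`.
        "speedup": t_serial / t_parallel if t_parallel > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Synthetic, highly repetitive input; real scientific data will behave differently.
    sample = b"scientific data " * (1 << 18)
    print(benchmark(sample))
```

The measured values depend on the machine, the input and the chunk size, so no expected output is given. Compressing chunks independently usually lowers CR slightly compared with a single stream, which is one reason a study of this kind reports CR and speedup together.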
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Almeida, S., Oliveira, V., Pina, A., Melle-Franco, M. (2014). Two High-Performance Alternatives to ZLIB Scientific-Data Compression. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8582. Springer, Cham. https://doi.org/10.1007/978-3-319-09147-1_45
DOI: https://doi.org/10.1007/978-3-319-09147-1_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09146-4
Online ISBN: 978-3-319-09147-1
eBook Packages: Computer Science, Computer Science (R0)