Evaluating Lossy Compression on Climate Data

  • Conference paper
Supercomputing (ISC 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7905)

Abstract

While the amount of data used by today’s high-performance computing (HPC) codes is huge, HPC users have not broadly adopted data compression techniques, apparently out of fear that compression will either unacceptably degrade data quality or be too slow to be worth the effort. In this paper, we examine the effects of three lossy compression methods (GRIB2 encoding, GRIB2 using JPEG 2000 and LZMA, and the commercial Samplify APAX algorithm) on decompressed data quality, compression ratio, and processing time. A careful evaluation of selected lossy and lossless compression methods is conducted, assessing their influence on data quality, storage requirements, and performance. The differences between input and decoded datasets are described and compared for the GRIB2 and APAX compression methods. Performance is measured using the compressed file sizes and the time spent on compression and decompression. The test data consist of 9 synthetic datasets that expose compression behavior and 123 climate variables output by a climate model. The benefits of lossy compression for HPC systems are described and related to our findings on data quality.
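The three quantities named in the abstract (compression ratio, the difference between input and decoded data, and time spent compressing and decompressing) can be illustrated with a small sketch. This is not the paper's benchmark and uses neither GRIB2 nor APAX: as a stand-in, it assumes a plain 16-bit fixed-point quantization followed by LZMA from the Python standard library. The function names `quantize`, `dequantize`, and `evaluate`, and the synthetic sine-shaped test series, are hypothetical and chosen only for illustration.

```python
import lzma
import math
import struct
import time

BITS = 16  # assumed code width; matches the "H" (uint16) struct format below


def quantize(values):
    """Lossy step: map floats onto integer codes in [0, 2**BITS - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << BITS) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Reconstruct approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


def evaluate(values):
    """Return ratio, error metrics, and timings for one data series."""
    raw = struct.pack(f"<{len(values)}d", *values)  # 8 bytes per value

    t0 = time.perf_counter()
    codes, lo, scale = quantize(values)
    packed = lzma.compress(struct.pack(f"<{len(codes)}H", *codes))
    t1 = time.perf_counter()
    unpacked = struct.unpack(f"<{len(codes)}H", lzma.decompress(packed))
    decoded = dequantize(unpacked, lo, scale)
    t2 = time.perf_counter()

    errors = [abs(a - b) for a, b in zip(values, decoded)]
    return {
        "ratio": len(raw) / len(packed),
        "max_abs_error": max(errors),
        "rmse": math.sqrt(sum(e * e for e in errors) / len(errors)),
        "error_bound": scale / 2,  # rounding to the nearest code
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


if __name__ == "__main__":
    # Hypothetical smooth series standing in for one model variable.
    series = [280.0 + 10.0 * math.sin(i / 50.0) for i in range(4096)]
    for name, value in evaluate(series).items():
        print(f"{name}: {value}")
```

Because the quantizer rounds to the nearest code, the maximum absolute error stays near half the quantization step, which is the kind of bounded deviation a data-quality comparison between input and decoded datasets would look at.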

This paper is partly funded by the DFG (GZ: LU 1353/5-1).

We also thank Luis Kornblueh for providing us with the climate dataset, without which this paper would not have been possible.

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hübbe, N., Wegener, A., Kunkel, J.M., Ling, Y., Ludwig, T. (2013). Evaluating Lossy Compression on Climate Data. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2013. Lecture Notes in Computer Science, vol 7905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38750-0_26

  • DOI: https://doi.org/10.1007/978-3-642-38750-0_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38749-4

  • Online ISBN: 978-3-642-38750-0

  • eBook Packages: Computer Science, Computer Science (R0)
