Compressing unstructured mesh data from simulations using machine learning
The amount of data output from a computer simulation has grown to terabytes and petabytes as increasingly complex simulations are run on massively parallel systems. As we approach exaflop computing in the next decade, it is expected that the I/O subsystem will not be able to write out these large volumes of data. In this paper, we explore the use of machine learning to compress the data before it is written out. Despite the computational constraints that limit us to very simple learning algorithms, our results show that machine learning is a viable option for compressing unstructured data. We demonstrate that simply using a better sampling algorithm to generate the training set gives more accurate results than random sampling, at no extra cost. Further, by carefully selecting and incorporating points with high prediction error, we can improve reconstruction accuracy without sacrificing the compression rate.
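The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. Everything in it is an assumption made for illustration: the function names `reconstruct` and `select_points`, the parameters `budget` and `frac_error`, the use of plain random sampling for the initial training set (the abstract reports that a better sampling algorithm does better), and the use of inverse-distance weighting over the k nearest stored points as a stand-in for a simple regression model. The sketch stores a subset of mesh points, predicts the remaining values from the stored ones, and spends part of a fixed storage budget on the points with the highest prediction error so that the compression rate is unchanged.

```python
import numpy as np

def reconstruct(kept_xy, kept_val, query_xy, k=5, eps=1e-12):
    """Predict values at query_xy from the k nearest stored points.

    Inverse-distance weighting is an illustrative stand-in for a simple
    regression model; it is not taken from the paper.
    """
    # Pairwise distances, shape (n_query, n_kept).
    d = np.linalg.norm(query_xy[:, None, :] - kept_xy[None, :, :], axis=2)
    k = min(k, kept_xy.shape[0])
    idx = np.argsort(d, axis=1)[:, :k]        # indices of the k nearest stored points
    dk = np.take_along_axis(d, idx, axis=1)   # their distances
    w = 1.0 / (dk + eps)                      # closer points get larger weights
    return (w * kept_val[idx]).sum(axis=1) / w.sum(axis=1)

def select_points(xy, val, budget, frac_error=0.2, k=5, seed=0):
    """Choose `budget` points to store: a random sample plus high-error points.

    The total number of stored points is fixed, so adding high-error points
    does not change the compression rate (hypothetical scheme).
    """
    rng = np.random.default_rng(seed)
    n_extra = int(round(frac_error * budget))
    n_initial = budget - n_extra
    keep = rng.choice(len(xy), size=n_initial, replace=False)
    # Prediction error at every mesh point using only the initial sample.
    err = np.abs(reconstruct(xy[keep], val[keep], xy, k) - val)
    err[keep] = -1.0                          # never re-select a stored point
    extra = np.argsort(err)[len(err) - n_extra:]
    return np.concatenate([keep, extra])

# Synthetic unstructured 2D points with a steep front (made-up test data).
rng = np.random.default_rng(1)
xy = rng.random((2000, 2))
val = np.tanh(20.0 * (xy[:, 0] - 0.5)) + 0.1 * xy[:, 1]

kept = select_points(xy, val, budget=200)
approx = reconstruct(xy[kept], val[kept], xy)
print("stored fraction:", len(kept) / len(xy))
print("max abs error:", np.abs(approx - val).max())
```

In this sketch only the coordinates and values at `kept` would be written out, and the full field is rebuilt by calling `reconstruct` at read time. Replacing the random initial sample with a more uniform one, such as Poisson disk sampling, would change only the line that draws `keep`.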
Keywords: Regression, Compression, Computer simulations, Mesh data
I thank the reviewers of both the original DSAA 2017 paper and this extended version for their careful review and thoughtful suggestions for improvements. I also thank Prof. Zhihong Lin, from UC Irvine, for providing access to the data generated as part of the GSEP SciDAC project. This work was funded by the ASCR Program (Dr. Lucille Nowell, Program Manager) at the Office of Science, US Department of Energy. LLNL-JRNL-750460. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Compliance with ethical standards
Conflict of interest
The author states that there is no conflict of interest.