Abstract
The analysis and optimization of HPC I/O is a daunting task that remains largely unaddressed. The SIOX project aims to help HPC users and system administrators alike improve the I/O performance of their applications by gaining awareness of the I/O operations taking place on the system and by launching corrective measures when a problem is encountered. Given the size of modern HPC clusters and the corresponding amount of I/O they generate, the SIOX project faces a series of scalability challenges that need to be resolved. Beyond presenting the architecture and operation of the SIOX system, this paper examines one of its biggest challenges, namely the transmission and management of large amounts of event-based I/O trace information, as well as the benefits that trace compression techniques such as ScalaTrace and C3G may offer.
Notes
- 1.
- 2. The compression ratio is defined as the uncompressed size divided by the compressed size of the trace [2].
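As a minimal illustration of this definition, the sketch below computes the compression ratio for a synthetic, highly repetitive trace. It uses Python's standard `zlib` module only as a stand-in compressor; it is not ScalaTrace or C3G, and the trace contents and the helper name `compression_ratio` are assumptions made for this example.

```python
import zlib


def compression_ratio(uncompressed_size: int, compressed_size: int) -> float:
    """Return uncompressed size divided by compressed size."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return uncompressed_size / compressed_size


# Hypothetical event trace: the same I/O call repeated many times,
# loosely mimicking the regular patterns of a parallel application.
trace = b"".join(b"write(fd=3, size=4096)\n" for _ in range(1000))

# zlib (DEFLATE) is used here purely for illustration.
compressed = zlib.compress(trace)

ratio = compression_ratio(len(trace), len(compressed))
print(f"uncompressed: {len(trace)} bytes")
print(f"compressed:   {len(compressed)} bytes")
print(f"compression ratio: {ratio:.1f}")
```

A ratio above 1 means the compressed trace is smaller than the original; larger values indicate better compression.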
References
Cleary, J.G., Witten, I.: Data compression using adaptive coding and partial string matching. IEEE Trans. Commun. 32(4), 396–402 (1984)
Knüpfer, A., Nagel, W.E.: Compressible memory data structures for event-based trace analysis. Future Gener. Comput. Syst. 22, 359–368 (2006). http://dx.doi.org/10.1016/j.future.2004.11.021
Liu, N., Cope, J., Carns, P., Carothers, C., Ross, R., Grider, G., Crume, A., Maltzahn, C.: On the role of burst buffers in leadership-class storage systems. In: 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), Pacific Grove, pp. 1–11. IEEE (2012)
Mahoney, M.: Data compression explained. Available at http://mattmahoney.net/dc/dce.html (2011)
Nakka, N., Choudhary, A., Liao, W.K., Ward, L., Klundt, R., Weston, M.I.: Detailed analysis of I/O traces for large scale applications. In: 2009 International Conference on High Performance Computing (HiPC), Kochi, pp. 419–427 (2009)
Noeth, M., Ratn, P., Mueller, F., Schulz, M., de Supinski, B.R.: ScalaTrace: scalable compression and replay of communication traces for high-performance computing. J. Parallel Distrib. Comput. 69, 696–710 (2009)
Pieterse, V., Black, P.E.: Algorithms and Theory of Computation Handbook. Available at: http://www.nist.gov/dads/HTML/singleprogrm.html (Dec 2004); Dictionary of Algorithms and Data Structures
Sandia National Laboratories: Scalable I/O project I/O traces. Available at: http://www.cs.sandia.gov/Scalable_IO/SNL_Trace_Data (2009)
Wiedemann, M.C., Kunkel, J.M., Zimmer, M., Ludwig, T., Resch, M., Bönisch, T., Wang, X., Chut, A., Aguilera, A., Nagel, W.E., Kluge, M., Mickler, H.: Towards I/O analysis of HPC systems and a generic architecture to collect access patterns. Comput. Sci. Res. Dev. 1, 1–11 (2012). http://dx.doi.org/10.1007/s00450-012-0221-5
Ziv, J., Lempel, A.: A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory 23(3), 337–343 (1977)
Acknowledgements
We would like to express our gratitude to the German Aerospace Center (DLR) as the responsible agency for the SIOX project as well as to the German Federal Ministry of Education and Research (BMBF) for the financial support under grant 01 IH 11008 A-C. Our gratitude also extends to Andreas Knüpfer and Joachim Protze from the Center for Information Services and High Performance Computing (ZIH) at the TU Dresden for the expertise provided when we used the C3G library.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Aguilera, A., Mickler, H., Kunkel, J., Zimmer, M., Wiedemann, M., Müller-Pfefferkorn, R. (2014). A Comparison of Trace Compression Methods for Massively Parallel Applications in Context of the SIOX Project. In: Knüpfer, A., Gracia, J., Nagel, W., Resch, M. (eds) Tools for High Performance Computing 2013. Springer, Cham. https://doi.org/10.1007/978-3-319-08144-1_8
DOI: https://doi.org/10.1007/978-3-319-08144-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08143-4
Online ISBN: 978-3-319-08144-1
eBook Packages: Mathematics and Statistics (R0)