Tools for Analyzing Parallel I/O

  • Julian Martin Kunkel
  • Eugen Betke
  • Matt Bryson
  • Philip Carns
  • Rosemary Francis
  • Wolfgang Frings
  • Roland Laifer
  • Sandra Mendez
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11203)


Parallel application I/O performance often does not meet user expectations. Additionally, slight access pattern modifications may lead to significant changes in performance due to complex interactions between hardware and software. These issues call for sophisticated tools to capture, analyze, understand, and tune application I/O.

In this paper, we highlight advances in monitoring tools to help address these issues. We also describe best practices, identify issues in measurement and analysis, and provide practical approaches to translate parallel I/O analysis into actionable outcomes for users, facility operators, and researchers.



PIOM-MP is partially supported by MICINN/MINECO Spain under contracts TIN2014-53172-P and TIN2017-84875-P. This material is based in part on work supported by the U.S. Department of Energy, Office of Science, under contract DE-AC02-06CH11357. We thank Felicity Pitchers for proofreading.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Julian Martin Kunkel (1)
  • Eugen Betke (2)
  • Matt Bryson (3)
  • Philip Carns (4)
  • Rosemary Francis (5)
  • Wolfgang Frings (6)
  • Roland Laifer (7)
  • Sandra Mendez (8)

  1. University of Reading, Reading, UK
  2. German Climate Computing Center (DKRZ), Hamburg, Germany
  3. University of California, Santa Cruz, USA
  4. Argonne National Laboratory, Lemont, USA
  5. Ellexus Ltd., Cambridge, UK
  6. Jülich Supercomputing Centre (JSC), Juelich, Germany
  7. Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
  8. Leibniz Supercomputing Centre (LRZ), München, Germany
