Abstract
Parallel file systems and MPI implementations aim to exploit available hardware resources in order to achieve optimal performance. Since performance is influenced by many hardware and software factors, achieving optimal performance is a daunting task, and optimized communication and I/O algorithms remain a subject of research. While the complexity of collective MPI operations is occasionally discussed in the literature, theoretical assessment of the measurements is de facto non-existent; the analysis conducted is typically limited to performance comparisons with previous algorithms.
However, observable performance is not determined solely by the quality of an algorithm. At run time, performance can be degraded by unexpected implementation issues and by triggered hardware and software exceptions. By applying a model that resembles the system, simulation allows us to estimate the expected performance. With this approach, the non-functional performance requirement of an implementation can be validated and run-time inefficiencies can be localized.
In this paper we demonstrate how simulation can be applied to assess the observed performance of collective MPI calls and parallel I/O. PIOsimHD, an event-driven simulator, is applied to validate observed performance on our 10-node cluster. The simulator replays recorded application activity and the point-to-point operations underlying collective operations. It also offers the option to record trace files of the simulated execution for visual comparison with the recorded behavior. With this innovative introspection into system behavior, several bottlenecks in the system and the implementation are localized.
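To illustrate the idea of replaying a collective operation as its underlying point-to-point messages, the following is a minimal event-driven sketch of a binomial-tree broadcast under a simple latency/bandwidth model. It is only a toy example in the spirit of such simulators, not PIOsimHD itself; the latency and bandwidth constants and the sequential-sender assumption are illustrative placeholders, not values from the paper.

```python
import heapq

# Assumed, illustrative network parameters (not from the paper):
ALPHA = 50e-6         # per-message latency in seconds
BETA = 1.0 / 100e6    # inverse bandwidth (seconds per byte)

def simulate_bcast(nprocs, msg_bytes):
    """Estimate when the last rank receives a binomial-tree broadcast."""
    cost = ALPHA + msg_bytes * BETA      # time for one point-to-point send
    recv_time = {0: 0.0}                 # rank 0 owns the data at t = 0
    events = [(0.0, 0)]                  # (time data arrives, rank)
    while events:
        t, rank = heapq.heappop(events)
        # In a binomial tree, a rank forwards to rank + 2^k for all
        # k >= bit_length(rank); sends are issued sequentially here,
        # so the sender is busy for one 'cost' per message.
        k = rank.bit_length()
        ready = t
        while rank + (1 << k) < nprocs:
            child = rank + (1 << k)
            ready += cost
            recv_time[child] = ready
            heapq.heappush(events, (ready, child))
            k += 1
    return max(recv_time.values())

# Example: estimate a 10-rank broadcast of a 1 MiB payload.
print(f"estimated broadcast completion: {simulate_bcast(10, 1 << 20):.6f} s")
```

Comparing such model estimates against measured completion times is the essence of the validation approach: a large gap between the two points at an implementation issue or a mis-modeled system component.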
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Kunkel, J.M. (2013). Using Simulation to Validate Performance of MPI(-IO) Implementations. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2013. Lecture Notes in Computer Science, vol 7905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38750-0_14
Print ISBN: 978-3-642-38749-4
Online ISBN: 978-3-642-38750-0