Abstract
This paper analyzes the performance and scalability characteristics of both the computational and I/O components of the Parallel Ice Sheet Model (PISM) executing in a multicore supercomputing environment. It examines the impact of multicore technologies on two state-of-the-art parallel I/O systems, both of which are built on the same underlying implementation of the MPI-IO standard yet exhibit very different performance and scalability characteristics. It also examines these same characteristics for the MPI-based computational engine of the simulation model. One important benefit of studying these three software systems both independently and together is that it exposes a fundamental tradeoff between providing scalable I/O and providing scalable computational performance in a multicore environment. The paper also presents performance results that, at least at first glance, appear highly counter-intuitive; we examine the underlying reasons for these results and discuss the insights gained from that examination.
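Both parallel I/O systems discussed in the paper ultimately funnel their requests through the MPI-IO layer. As context for readers unfamiliar with that layer, the following is a minimal sketch (not taken from the paper; the file name, array size, and contiguous layout are illustrative assumptions) of a collective MPI-IO write of the kind such libraries issue on each rank.

```c
/* Minimal sketch of a collective MPI-IO write.
 * Assumptions (not from the paper): file name "output.dat",
 * 1024 doubles per rank, contiguous per-rank layout. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int local_n = 1024;                    /* doubles owned by this rank */
    double *buf = malloc(local_n * sizeof(double));
    for (int i = 0; i < local_n; i++)
        buf[i] = (double)rank;                   /* dummy data */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes its contiguous slice; the collective call lets the
     * MPI-IO implementation aggregate requests (e.g., two-phase I/O). */
    MPI_Offset offset = (MPI_Offset)rank * local_n * sizeof(double);
    MPI_File_write_at_all(fh, offset, buf, local_n, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}
```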
Acknowledgements
This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1053575.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dickens, P. (2015). A Performance and Scalability Analysis of the MPI Based Tools Utilized in a Large Ice Sheet Model Executing in a Multicore Environment. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds.) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol. 9531. Springer, Cham. https://doi.org/10.1007/978-3-319-27140-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27139-2
Online ISBN: 978-3-319-27140-8