A Performance Assessment of the Unified Model
The Unified Model (UM) is a modelling system developed by the UK Met Office for Numerical Weather Prediction (NWP) and climate simulation. It is used extensively by university, government and other research organizations on the supercomputer hosted at the National Computational Infrastructure (NCI). A 3-year collaboration between NCI, the Australian Bureau of Meteorology and Fujitsu is underway to address performance and scalability issues in the UM on NCI’s supercomputer, Raijin.
IO performance is the dominant factor in the overall performance of the UM. The IO server approach employed is sophisticated and requires proper calibration to achieve acceptable performance. Global synchronization and file lock contention are problems that can be remedied with simple MPI global collective calls. Complementary IO strategies, such as MPI-IO and directed IO, are being investigated for implementation.
The OpenMP implementation in the UM is also investigated and found to have inefficiencies that are detrimental to the load balance of the model. Only loop-wise parallelism is employed. Because the model's workload is inherently imbalanced, a task-wise approach could yield improved threading efficiency.
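The load-balance argument can be illustrated with a toy scheduling model. The sketch below is not taken from the UM and does not use OpenMP; it is a minimal Python simulation, under assumed per-iteration costs, that compares a static loop-wise split (each thread gets a contiguous block of iterations) with a task-wise assignment (each work item goes to whichever thread becomes free first). The function names and the cost values are hypothetical and chosen only for illustration.

```python
import heapq


def static_makespan(costs, n_threads):
    """Loop-wise, static split: each thread gets one contiguous block of iterations."""
    base, extra = divmod(len(costs), n_threads)
    loads = []
    start = 0
    for t in range(n_threads):
        # The first `extra` threads take one additional iteration.
        size = base + (1 if t < extra else 0)
        loads.append(sum(costs[start:start + size]))
        start += size
    # The parallel region finishes when the most heavily loaded thread finishes.
    return max(loads)


def dynamic_makespan(costs, n_threads):
    """Task-wise assignment: each item goes to the thread that becomes free first."""
    finish_times = [0] * n_threads
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


if __name__ == "__main__":
    # Hypothetical imbalanced workload: four expensive iterations, then twelve cheap ones.
    costs = [8, 8, 8, 8] + [1] * 12
    print("static :", static_makespan(costs, 4))
    print("dynamic:", dynamic_makespan(costs, 4))
```

With these assumed costs the static split places all four expensive iterations on one thread, while the task-wise assignment spreads them across threads. Real task-based OpenMP code also incurs scheduling overhead that this model ignores, so the comparison indicates only the direction of the effect.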
Keywords: unified model, numerical weather prediction, performance analysis, high performance computing