Computational Geosciences, Volume 22, Issue 1, pp 347–361

Enhancing speed and scalability of the ParFlow simulation code

  • Carsten Burstedde
  • Jose A. Fonseca
  • Stefan Kollet
Original Paper


Regional hydrology studies are often supported by high-resolution simulations of subsurface flow that require expensive and extensive computations. Efficient use of the latest high-performance parallel computing systems thus becomes a necessity. The simulation software ParFlow has been demonstrated to meet this requirement, showing excellent solver scalability for up to 16,384 processes. In the present work, we show that the code requires further enhancements in order to fully take advantage of current petascale machines. We identify the way ParFlow parallelizes its computational mesh as a central bottleneck. We propose to reorganize this subsystem using the fast mesh partitioning algorithms provided by the parallel adaptive mesh refinement library p4est. We realize this in a minimally invasive manner by modifying selected parts of the code to reinterpret the existing mesh data structures. We evaluate the scaling performance of the modified version of ParFlow, demonstrating good weak and strong scaling up to 458k cores of the Juqueen supercomputer, and test an example application at large scale.
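To give a flavor of the partitioning approach mentioned in the abstract, the following is a minimal, self-contained sketch of Z-order (Morton) space-filling-curve partitioning, the technique underlying p4est-style mesh distribution. It is purely illustrative: the function names `morton_index` and `partition` are ours, and this is not code from ParFlow or p4est. Mesh cells are sorted along the curve and then split into equal-size contiguous slices, one per rank, which tends to keep each rank's cells spatially clustered.

```python
# Illustrative sketch (not ParFlow/p4est source): distribute the cells of a
# uniform 2D mesh across ranks by sorting them along a Morton (Z-order)
# space-filling curve and cutting the curve into equal contiguous pieces.

def morton_index(x, y, bits=16):
    """Interleave the bits of (x, y) to get the cell's position on the Z-curve."""
    z = 0
    for i in range(bits):
        z |= (x >> i & 1) << (2 * i)
        z |= (y >> i & 1) << (2 * i + 1)
    return z

def partition(nx, ny, num_ranks):
    """Map each cell (x, y) of an nx-by-ny mesh to an owner rank."""
    cells = sorted(((x, y) for x in range(nx) for y in range(ny)),
                   key=lambda c: morton_index(*c))
    n = len(cells)
    owner = {}
    for rank in range(num_ranks):
        lo = rank * n // num_ranks        # equal-size slices of the curve
        hi = (rank + 1) * n // num_ranks
        for cell in cells[lo:hi]:
            owner[cell] = rank
    return owner

# 16 cells on 4 ranks: each rank ends up owning one 2x2 quadrant,
# illustrating the locality that the curve ordering preserves.
owner = partition(4, 4, 4)
```

Because the curve index is computed locally per cell and the cut points depend only on the total cell count, such a partition can be established without any global communication, which is what makes this family of algorithms attractive at large process counts.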


Keywords: Subsurface flow · Numerical simulation · High-performance computing

Mathematics Subject Classification (2010)

65Y05 · 65M50 · 86A05 · 76S05





This work was made possible by the financial support of the collaborative research initiative SFB/TR32 “Patterns in Soil-Vegetation-Atmosphere Systems: Monitoring, Modeling, and Data Assimilation,” project D8, funded by the Deutsche Forschungsgemeinschaft (DFG). Authors B. and F. gratefully acknowledge additional travel support from the Bonn Hausdorff Center for Mathematics (HCM), also funded by the DFG.

We would also like to thank the Gauss Centre for Supercomputing (GCS) for providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS share of the supercomputer Juqueen at the Jülich Supercomputing Centre (JSC). GCS is the alliance of the three national supercomputing centres HLRS (Universität Stuttgart), JSC (Forschungszentrum Jülich), and LRZ (Bayerische Akademie der Wissenschaften), funded by the German Federal Ministry of Education and Research (BMBF) and the German state ministries for research of Baden-Württemberg (MWK), Bayern (StMWFK), and Nordrhein-Westfalen (MIWF).

Our contributions to the ParFlow code, and the scripts defining the test configurations for the numerical experiments presented in this work, are available as open source at



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Carsten Burstedde (1)
  • Jose A. Fonseca (1)
  • Stefan Kollet (2)
  1. Institut für Numerische Simulation and Hausdorff Center for Mathematics, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
  2. Agrosphere (IBG-3), Forschungszentrum Jülich GmbH and Centre for High-Performance Scientific Computing in Terrestrial Systems, Geoverbund ABC/J, Jülich, Germany
