
A priori analysis on deep learning of subgrid-scale parameterizations for Kraichnan turbulence


In the present study, we investigate different data-driven parameterizations for large eddy simulation of two-dimensional turbulence in an a priori setting. These models use resolved flow field variables on the coarser grid to estimate the subgrid-scale stresses. We use data-driven closure models based on localized learning, which employs a multilayer feedforward artificial neural network with point-to-point mapping and neighboring stencil data mapping, and on a convolutional neural network fed by data snapshots of the whole domain. The performance of these data-driven closure models is measured through the probability density function of the stresses and is compared with the dynamic Smagorinsky model (DSM). The quantitative performance is evaluated using the cross-correlation coefficient between the true and predicted stresses. We analyze the different frameworks in terms of the amount of training data, the selection of input and output features, their modeling accuracy, and the computational time for training and deployment. We also demonstrate the computational gain that can be achieved using the intelligent eddy viscosity model, which learns the eddy viscosity computed by the DSM instead of the subgrid-scale stresses. We detail the hyperparameter optimization of these models using the grid search algorithm.




1. Durbin, P.A.: Near-wall turbulence closure modeling without “damping functions”. Theor. Comput. Fluid Dyn. 3(1), 1 (1991)
2. Launder, B.E., Reece, G.J., Rodi, W.: Progress in the development of a Reynolds-stress turbulence closure. J. Fluid Mech. 68(3), 537 (1975)
3. Meneveau, C., Katz, J.: Scale-invariance and turbulence models for large-eddy simulation. Annu. Rev. Fluid Mech. 32(1), 1 (2000)
4. Mellor, G.L., Yamada, T.: Development of a turbulence closure model for geophysical fluid problems. Rev. Geophys. 20(4), 851 (1982)
5. Bardina, J., Ferziger, J., Reynolds, W.: Improved subgrid-scale models for large-eddy simulation. In: 13th Fluid and Plasma Dynamics Conference, p. 1357 (1980)
6. Rogallo, R.S., Moin, P.: Numerical simulation of turbulent flows. Annu. Rev. Fluid Mech. 16(1), 99 (1984)
7. Erlebacher, G., Hussaini, M.Y., Speziale, C.G., Zang, T.A.: Toward the large-eddy simulation of compressible turbulent flows. J. Fluid Mech. 238, 155 (1992)
8. Frisch, U., Kolmogorov, A.N.: Turbulence: The Legacy of A.N. Kolmogorov. Cambridge University Press, Cambridge (1995)
9. Smagorinsky, J.: General circulation experiments with the primitive equations: I. The basic experiment. Mon. Weather Rev. 91(3), 99 (1963)
10. Deardorff, J.W.: A numerical study of three-dimensional turbulent channel flow at large Reynolds numbers. J. Fluid Mech. 41(2), 453 (1970)
11. McMillan, O., Ferziger, J., Rogallo, R.: Tests of subgrid-scale models in strained turbulence. In: 13th Fluid and Plasma Dynamics Conference, p. 1339 (1980)
12. Mason, P., Callen, N.: On the magnitude of the subgrid-scale eddy coefficient in large-eddy simulations of turbulent channel flow. J. Fluid Mech. 162, 439 (1986)
13. Piomelli, U., Moin, P., Ferziger, J.H.: Model consistency in large eddy simulation of turbulent channel flows. Phys. Fluids 31(7), 1884 (1988)
14. Germano, M., Piomelli, U., Moin, P., Cabot, W.H.: A dynamic subgrid-scale eddy viscosity model. Phys. Fluids A 3(7), 1760 (1991)
15. Lilly, D.K.: A proposed modification of the Germano subgrid-scale closure method. Phys. Fluids A 4(3), 633 (1992)
16. Ghosal, S., Lund, T.S., Moin, P., Akselvoll, K.: A dynamic localization model for large-eddy simulation of turbulent flows. J. Fluid Mech. 286, 229 (1995)
17. Meneveau, C., Lund, T.S., Cabot, W.H.: A Lagrangian dynamic subgrid-scale model of turbulence. J. Fluid Mech. 319, 353 (1996)
18. Park, N., Mahesh, K.: Reduction of the Germano-identity error in the dynamic Smagorinsky model. Phys. Fluids 21(6), 065106 (2009)
19. Brunton, S.L., Noack, B.R., Koumoutsakos, P.: Machine learning for fluid mechanics. Annu. Rev. Fluid Mech. (2019)
20. Brenner, M., Eldredge, J., Freund, J.: Perspective on machine learning for advancing fluid mechanics. Phys. Rev. Fluids 4(10), 100501 (2019)
21. Kutz, J.N.: Deep learning in fluid dynamics. J. Fluid Mech. 814, 1 (2017)
22. Milano, M., Koumoutsakos, P.: Neural network modeling for near wall turbulent flow. J. Comput. Phys. 182(1), 1 (2002)
23. Erichson, N.B., Mathelin, L., Yao, Z., Brunton, S.L., Mahoney, M.W., Kutz, J.N.: Shallow learning for fluid flow reconstruction with limited sensors and limited data. ArXiv preprint arXiv:1902.07358 (2019)
24. Fukami, K., Fukagata, K., Taira, K.: Super-resolution reconstruction of turbulent flows with machine learning. J. Fluid Mech. 870, 106 (2019)
25. Lee, K., Carlberg, K.: Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders. ArXiv preprint arXiv:1812.08373 (2018)
26. Murata, T., Fukami, K., Fukagata, K.: Nonlinear mode decomposition with convolutional neural networks for fluid dynamics. J. Fluid Mech. 882, A13 (2020)
27. Rudy, S.H., Brunton, S.L., Proctor, J.L., Kutz, J.N.: Data-driven discovery of partial differential equations. Sci. Adv. 3(4), e1602614 (2017)
28. Long, Z., Lu, Y., Ma, X., Dong, B.: PDE-net: learning PDEs from data. ArXiv preprint arXiv:1710.09668 (2017)
29. Raissi, M., Karniadakis, G.E.: Hidden physics models: machine learning of nonlinear partial differential equations. J. Comput. Phys. 357, 125 (2018)
30. Pathak, J., Hunt, B., Girvan, M., Lu, Z., Ott, E.: Model-free prediction of large spatiotemporally chaotic systems from data: a reservoir computing approach. Phys. Rev. Lett. 120(2), 024102 (2018)
31. Vlachas, P.R., Byeon, W., Wan, Z.Y., Sapsis, T.P., Koumoutsakos, P.: Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks. Proc. R. Soc. A Math. Phys. Eng. Sci. 474(2213), 20170844 (2018)
32. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Numerical Gaussian processes for time-dependent and nonlinear partial differential equations. SIAM J. Sci. Comput. 40(1), A172 (2018)
33. Pawar, S., Rahman, S.M., Vaddireddy, H., San, O., Rasheed, A., Vedula, P.: A deep learning enabler for nonintrusive reduced order modeling of fluid flows. Phys. Fluids 31(8), 085101 (2019)
34. Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686 (2019)
35. Erichson, N.B., Muehlebach, M., Mahoney, M.W.: Physics-informed autoencoders for Lyapunov-stable fluid flow prediction. ArXiv preprint arXiv:1905.10866 (2019)
36. Magiera, J., Ray, D., Hesthaven, J.S., Rohde, C.: Constraint-aware neural networks for Riemann problems. ArXiv preprint arXiv:1904.12794 (2019)
37. Ling, J., Kurzawski, A., Templeton, J.: Reynolds averaged turbulence modelling using deep neural networks with embedded invariance. J. Fluid Mech. 807, 155 (2016)
38. Wu, J.L., Xiao, H., Paterson, E.: Physics-informed machine learning approach for augmenting turbulence models: a comprehensive framework. Phys. Rev. Fluids 3(7), 074602 (2018)
39. Maulik, R., San, O., Rasheed, A., Vedula, P.: Subgrid modelling for two-dimensional turbulence using neural networks. J. Fluid Mech. 858, 122 (2019)
40. Mohebujjaman, M., Rebholz, L.G., Iliescu, T.: Physically constrained data-driven correction for reduced-order modeling of fluid flows. Int. J. Numer. Methods Fluids 89(3), 103 (2019)
41. Duraisamy, K., Iaccarino, G., Xiao, H.: Turbulence modeling in the age of data. Annu. Rev. Fluid Mech. 51, 357 (2019)
42. Lapeyre, C.J., Misdariis, A., Cazard, N., Veynante, D., Poinsot, T.: Training convolutional neural networks to estimate turbulent sub-grid scale reaction rates. Combust. Flame 203, 255 (2019)
43. King, R., Hennigh, O., Mohan, A., Chertkov, M.: From deep to physics-informed learning of turbulence: diagnostics. ArXiv preprint arXiv:1810.07785 (2018)
44. Wang, Z., Luo, K., Li, D., Tan, J., Fan, J.: Investigations of data-driven closure for subgrid-scale stress in large-eddy simulation. Phys. Fluids 30(12), 125101 (2018)
45. Taira, K.: Revealing essential dynamics from high-dimensional fluid flow data and operators. ArXiv preprint arXiv:1903.01913 (2019)
46. Tracey, B., Duraisamy, K., Alonso, J.: Application of supervised learning to quantify uncertainties in turbulence and combustion modeling. In: 51st AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, p. 259 (2013)
47. Tracey, B.D., Duraisamy, K., Alonso, J.J.: A machine learning strategy to assist turbulence model development. In: 53rd AIAA Aerospace Sciences Meeting, p. 1287 (2015)
48. Ling, J., Ruiz, A., Lacaze, G., Oefelein, J.: Uncertainty analysis and data-driven model advances for a jet-in-crossflow. J. Turbomach. 139(2), 021008 (2017)
49. Sarghini, F., De Felice, G., Santini, S.: Neural networks based subgrid scale modeling in large eddy simulations. Comput. Fluids 32(1), 97 (2003)
50. Pope, S.: A more general effective-viscosity hypothesis. J. Fluid Mech. 72(2), 331 (1975)
51. Gamahara, M., Hattori, Y.: Searching for turbulence models by artificial neural network. Phys. Rev. Fluids 2(5), 054604 (2017)
52. Wang, J.X., Wu, J.L., Xiao, H.: Physics-informed machine learning approach for reconstructing Reynolds stress modeling discrepancies based on DNS data. Phys. Rev. Fluids 2(3), 034603 (2017)
53. Bhatnagar, S., Afshar, Y., Pan, S., Duraisamy, K., Kaushik, S.: Prediction of aerodynamic flow fields using convolutional neural networks. Comput. Mech. 64, 525–545 (2019)
54. Beck, A., Flad, D., Munz, C.D.: Deep neural networks for data-driven LES closure models. J. Comput. Phys. 398, 108910 (2019)
55. Srinivasan, P., Guastoni, L., Azizpour, H., Schlatter, P., Vinuesa, R.: Predictions of turbulent shear flows using deep neural networks. Phys. Rev. Fluids 4(5), 054603 (2019)
56. Pal, A.: Deep learning parameterization of subgrid scales in wall-bounded turbulent flows. ArXiv preprint arXiv:1905.12765 (2019)
57. Maulik, R., San, O.: A neural network approach for the blind deconvolution of turbulent flows. J. Fluid Mech. 831, 151 (2017)
58. Kraichnan, R.H.: The structure of isotropic turbulence at very high Reynolds numbers. J. Fluid Mech. 5(4), 497 (1959)
59. Kraichnan, R.H., Montgomery, D.: Two-dimensional turbulence. Rep. Prog. Phys. 43(5), 547 (1980)
60. Leith, C.: Atmospheric predictability and two-dimensional turbulence. J. Atmos. Sci. 28(2), 145 (1971)
61. Boffetta, G., Ecke, R.E.: Two-dimensional turbulence. Annu. Rev. Fluid Mech. 44, 427 (2012)
62. Kraichnan, R.H.: Inertial ranges in two-dimensional turbulence. Phys. Fluids 10(7), 1417 (1967)
63. Batchelor, G.K.: Computation of the energy spectrum in homogeneous two-dimensional turbulence. Phys. Fluids 12(12), II (1969)
64. Leonard, A.: Advances in Geophysics, vol. 18, pp. 237–248. Elsevier, Amsterdam (1975)
65. Liu, S., Meneveau, C., Katz, J.: Experimental study of similarity subgrid-scale models of turbulence in the far-field of a jet. Appl. Sci. Res. 54(3), 177 (1995)
66. San, O.: A dynamic eddy-viscosity closure model for large eddy simulations of two-dimensional decaying turbulence. Int. J. Comput. Fluid Dyn. 28(6–10), 363 (2014)
67. Maulik, R., San, O.: A stable and scale-aware dynamic modeling framework for subgrid-scale parameterizations of two-dimensional turbulence. Comput. Fluids 158, 11 (2017)
68. Hagan, M.T., Demuth, H.B., Beale, M.H., De Jesús, O.: Neural Network Design, vol. 20. PWS Pub., Boston (1996)
69. Glorot, X., Bengio, Y.: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)
70. Sutskever, I., Martens, J., Dahl, G., Hinton, G.: International Conference on Machine Learning, pp. 1139–1147 (2013)
71. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. ArXiv preprint arXiv:1412.6980 (2014)
72. Ruder, S.: An overview of gradient descent optimization algorithms. ArXiv preprint arXiv:1609.04747 (2016)
73. Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., Fergus, R.: International Conference on Machine Learning, pp. 1058–1066 (2013)
74. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929 (2014)
75. Bartoldson, B.R., Morcos, A.S., Barbu, A., Erlebacher, G.: The generalization-stability tradeoff in neural network pruning. ArXiv preprint arXiv:1906.03728 (2019)
76. Zhu, L., Zhang, W., Kou, J., Liu, Y.: Machine learning methods for turbulence modeling in subsonic flows around airfoils. Phys. Fluids 31(1), 015105 (2019)
77. Xie, C., Wang, J., Li, H., Wan, M., Chen, S.: Artificial neural network mixed model for large eddy simulation of compressible isotropic turbulence. Phys. Fluids 31(8), 085112 (2019)
78. Yang, X., Zafar, S., Wang, J.X., Xiao, H.: Predictive large-eddy-simulation wall modeling via physics-informed neural networks. Phys. Rev. Fluids 4(3), 034602 (2019)
79. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
80. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, pp. 91–99 (2015)
81. Kim, J., Lee, J.K., Lee, K.M.: Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1646–1654 (2016)
82. Dong, C., Loy, C.C., Tang, X.: Accelerating the super-resolution convolutional neural network. In: European Conference on Computer Vision. Springer, pp. 391–407 (2016)
83. Hou, W., Darakananda, D., Eldredge, J.: Machine learning based detection of flow disturbances using surface pressure measurements. In: AIAA Scitech 2019 Forum, p. 1148 (2019)
84. Nikolaou, Z.M., Chrysostomou, C., Vervisch, L., Cant, S.: Modelling turbulent premixed flames using convolutional neural networks: application to sub-grid scale variance and filtered reaction rate. ArXiv preprint arXiv:1810.07944 (2018)
85. Nikolaou, Z., Chrysostomou, C., Vervisch, L., Cant, S.: Progress variable variance and filtered rate modelling using convolutional neural networks and flamelet methods. Flow Turbul. Combust. 103, 1–17 (2019)
86. Tabeling, P.: Two-dimensional turbulence: a physicist approach. Phys. Rep. 362(1), 1 (2002)
87. Orlandi, P.: Fluid Flow Phenomena: A Numerical Toolkit, vol. 55. Springer, Berlin (2012)
88. San, O., Staples, A.E.: High-order methods for decaying two-dimensional homogeneous isotropic turbulence. Comput. Fluids 63, 105 (2012)
89. Kleissl, J., Kumar, V., Meneveau, C., Parlange, M.B.: Numerical study of dynamic Smagorinsky models in large-eddy simulation of the atmospheric boundary layer: validation in stable and unstable conditions. Water Resour. Res. 42(6), W06D10 (2006)
90. Galperin, B., Orszag, S.A.: Large Eddy Simulation of Complex Engineering and Geophysical Flows. Cambridge University Press, Cambridge (1993)
91. Khani, S., Waite, M.L.: Large eddy simulations of stratified turbulence: the dynamic Smagorinsky model. J. Fluid Mech. 773, 327 (2015)
92. Moin, P., Squires, K., Cabot, W., Lee, S.: A dynamic subgrid-scale model for compressible turbulence and scalar transport. Phys. Fluids A 3(11), 2746 (1991)
93. Xu, Y., Fan, T., Xu, M., Zeng, L., Qiao, Y.: SpiderCNN: deep learning on point sets with parameterized convolutional filters. Proceedings of the European Conference on Computer Vision (ECCV), pp. 87–102 (2018)
94. Trask, N., Patel, R.G., Gross, B.J., Atzberger, P.J.: GMLS-Nets: a framework for learning from unstructured data. ArXiv preprint arXiv:1909.05371 (2019)
95. Thomas, H., Qi, C.R., Deschaud, J.E., Marcotegui, B., Goulette, F., Guibas, L.J.: KPConv: flexible and deformable convolution for point clouds. ArXiv preprint arXiv:1904.08889 (2019)
96. Fey, M., Lenssen, J.E., Weichert, F., Müller, H.: SplineCNN: fast geometric deep learning with continuous B-spline kernels. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 869–877 (2018)
97. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. Advances in Neural Information Processing Systems, pp. 2672–2680 (2014)


This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under Award No. DE-SC0019290. Omer San gratefully acknowledges their support.

Author information



Corresponding author

Correspondence to Omer San.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Disclaimer: This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

Communicated by Kunihiko Taira.


Appendix A: Derivation of the Smagorinsky model in 2D turbulence

From Eq. 5, the subgrid-scale (SGS) stresses in a 2D field can be written as

$$\begin{aligned} \tau _{ij}&= \overline{u_i u_j} - {\bar{u}}_i {\bar{u}}_j \\&= \underbrace{\frac{1}{2}\tau _{kk}\delta _{ij}}_{k_{\mathrm{SGS}}\delta _{ij}} + \bigg (\underbrace{ \tau _{ij} - \frac{1}{2}\tau _{kk}\delta _{ij}}_{\tau _{ij}^d} \bigg ). \end{aligned}$$

The SGS stresses can be written as

$$\begin{aligned} \tau = k_{\mathrm{SGS}}I + \tau ^d, \end{aligned}$$

where \(k_{\mathrm{SGS}}=\frac{1}{2}\tau _{kk}\) is the subgrid-scale kinetic energy. Here we use the conventional summation notation over repeated indices, so that \(\tau _{kk} = \tau _{11} + \tau _{22}\) in 2D. In the Smagorinsky model, we model the deviatoric (traceless) part of the SGS stresses as

$$\begin{aligned} \tau _{ij}^d = -2\nu _\mathrm{e}{\bar{S}}_{ij}^d, \end{aligned}$$

where \(\nu _\mathrm{e}\) is the SGS eddy viscosity, and \({\bar{S}}_{ij}^d\) is the deviatoric part of the resolved strain rate tensor \({\bar{S}}_{ij}\), which is given by

$$\begin{aligned} {\bar{S}}_{ij} = \frac{1}{2}\bigg ( \frac{\partial {\bar{u}}_i}{\partial x_j} + \frac{\partial {\bar{u}}_j}{\partial x_i} \bigg ), \end{aligned}$$

which can be written explicitly as

$$\begin{aligned} {\bar{S}} = \begin{bmatrix} \frac{\partial {\bar{u}}}{\partial x} &{}\quad \frac{1}{2}\bigg ( \frac{\partial {\bar{u}}}{\partial y} + \frac{\partial {\bar{v}}}{\partial x} \bigg ) \\ \frac{1}{2}\bigg ( \frac{\partial {\bar{v}}}{\partial x} + \frac{\partial {\bar{u}}}{\partial y} \bigg ) &{}\quad \frac{\partial {\bar{v}}}{\partial y} \end{bmatrix} . \end{aligned}$$

The trace of \({\bar{S}}\) is zero owing to the continuity equation for incompressible flows. Therefore, \({\bar{S}}_{ij}^d = {\bar{S}}_{ij}\) and the Smagorinsky model becomes

$$\begin{aligned} \tau _{ij}^d = -2\nu _\mathrm{e}{\bar{S}}_{ij}. \end{aligned}$$

The eddy viscosity approximation computes \(\nu _\mathrm{e}\) using the following relation

$$\begin{aligned} \nu _\mathrm{e} = C_k \varDelta \sqrt{k_{\mathrm{SGS}}}, \end{aligned}$$

where the proportionality constant is often set to \(C_k = 0.094\), and \(\varDelta \) is the length scale (usually the grid size). The SGS kinetic energy \(k_{\mathrm{SGS}}\) is computed from the local equilibrium assumption, which balances subgrid-scale energy production and dissipation:

$$\begin{aligned} {\bar{S}}:\tau + C_{\epsilon }\frac{k_{\mathrm{SGS}}^{1.5}}{\varDelta } = 0, \end{aligned}$$

where the first term in the above equation is the production flux, the second term is the dissipation flux, and the dissipation constant is often set to \(C_{\epsilon }=1.048\). The double inner product operation  :  is given by

$$\begin{aligned} {\bar{S}}:\tau = {\bar{S}}_{ij}\tau _{ij}={\bar{S}}_{11}\tau _{11} + {\bar{S}}_{12}\tau _{12} + {\bar{S}}_{21}\tau _{21} + {\bar{S}}_{22}\tau _{22}. \end{aligned}$$

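Substituting Eqs. 35 and 39 (the decomposition of \(\tau \) and the Smagorinsky model), together with the eddy viscosity relation of Eq. 40, into the equilibrium balance of Eq. 41, we get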

$$\begin{aligned} {\bar{S}}:(k_{\mathrm{SGS}}I - 2C_k \varDelta \sqrt{k_{\mathrm{SGS}}}{\bar{S}}) + C_{\epsilon }\frac{k_{\mathrm{SGS}}^{1.5}}{\varDelta }&= 0, \\ \sqrt{k_{\mathrm{SGS}}}\bigg ( \frac{C_{\epsilon }}{\varDelta }k_{\mathrm{SGS}} + \sqrt{k_{\mathrm{SGS}}} \underbrace{{\bar{S}}:I}_{{\bar{S}}_{ij} \delta _{ij} = 0} - 2C_k \varDelta {\bar{S}}:{\bar{S}} \bigg )&= 0, \\ \frac{C_{\epsilon }}{\varDelta }k_{\mathrm{SGS}} - 2C_k \varDelta {\bar{S}}:{\bar{S}}&= 0. \end{aligned}$$

From the above equations, the subgrid-scale kinetic energy can be written as

$$\begin{aligned} k_{\mathrm{SGS}}&= \frac{C_k}{C_{\epsilon }}\varDelta ^2(2{\bar{S}}:{\bar{S}}) \\&= \frac{C_k}{C_{\epsilon }}\varDelta ^2|{\bar{S}}|^2, \end{aligned}$$

where \(|{\bar{S}}| = \sqrt{2 {\bar{S}}_{ij} {\bar{S}}_{ij}}\). Furthermore, substituting this expression for \(k_{\mathrm{SGS}}\) into Eq. 40, we get

$$\begin{aligned} \nu _\mathrm{e} = C_k \varDelta ^2 \sqrt{\frac{C_k}{C_\epsilon }}|{\bar{S}}|. \end{aligned}$$

We can define a new constant coefficient as

$$\begin{aligned} C_\mathrm{s}^2 = C_k \sqrt{\frac{C_k}{C_\epsilon }}, \end{aligned}$$

where \(C_\mathrm{s}=0.1678\) is called the Smagorinsky coefficient. Finally, we get the following expression for the SGS eddy viscosity

$$\begin{aligned} \nu _\mathrm{e} = C_\mathrm{s}^2 \varDelta ^2 |{\bar{S}}|, \end{aligned}$$

and the Smagorinsky model, given by Eq. 36, reads as

$$\begin{aligned} \tau _{ij}^{d} = -2C_\mathrm{s}^2 \varDelta ^2 |{\bar{S}}|{\bar{S}}_{ij}. \end{aligned}$$
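The closed-form results of this appendix can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the velocity gradients are assumed to be supplied as plain numbers at a single grid point. It evaluates \(C_\mathrm{s}\) from the quoted constants and confirms that \(\nu _\mathrm{e} = C_\mathrm{s}^2 \varDelta ^2 |{\bar{S}}|\) agrees with \(\nu _\mathrm{e} = C_k \varDelta \sqrt{k_{\mathrm{SGS}}}\).

```python
import math

C_K = 0.094    # proportionality constant C_k quoted in the text
C_EPS = 1.048  # constant C_epsilon quoted in the text


def smagorinsky_coefficient(c_k=C_K, c_eps=C_EPS):
    """Return C_s from C_s^2 = C_k * sqrt(C_k / C_eps)."""
    return math.sqrt(c_k * math.sqrt(c_k / c_eps))


def strain_rate_magnitude(dudx, dudy, dvdx, dvdy):
    """Return |S| = sqrt(2 S_ij S_ij) for a 2D velocity gradient."""
    s11 = dudx
    s22 = dvdy
    s12 = 0.5 * (dudy + dvdx)  # S_12 equals S_21 by symmetry
    return math.sqrt(2.0 * (s11 * s11 + 2.0 * s12 * s12 + s22 * s22))


def sgs_kinetic_energy(delta, dudx, dudy, dvdx, dvdy):
    """Return k_SGS = (C_k / C_eps) * Delta^2 * |S|^2."""
    s_mag = strain_rate_magnitude(dudx, dudy, dvdx, dvdy)
    return (C_K / C_EPS) * delta ** 2 * s_mag ** 2


def eddy_viscosity(delta, dudx, dudy, dvdx, dvdy):
    """Return nu_e = C_s^2 * Delta^2 * |S|."""
    c_s = smagorinsky_coefficient()
    s_mag = strain_rate_magnitude(dudx, dudy, dvdx, dvdy)
    return c_s ** 2 * delta ** 2 * s_mag


if __name__ == "__main__":
    # Illustrative inputs (assumed): pure shear with du/dy = 1 and Delta = 0.1.
    print("C_s  =", round(smagorinsky_coefficient(), 4))
    print("nu_e =", eddy_viscosity(0.1, 0.0, 1.0, 0.0, 0.0))
```

The deviatoric stress then follows as \(-2\nu _\mathrm{e}{\bar{S}}_{ij}\) component by component.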

Appendix B: Hyperparameters optimization

In this appendix, we outline the procedure we followed to select the hyperparameters for the ANN with point-to-point mapping and neighboring stencil mapping. An ANN has many hyperparameters, such as the number of neurons, number of hidden layers, loss function, optimization algorithm, activation function, and batch size. If we use regularization, dropout, or weight decay to avoid overfitting, the design space of hyperparameters grows further.

We focus on three main hyperparameters of the ANN: the number of neurons, the number of hidden layers, and the learning rate of the optimization algorithm. The training data are scaled to \([-1,1]\) using the minimum and maximum values in the training dataset. We use the ReLU activation function given by \(\zeta (\chi ) = \text {max}(0,\chi )\), where \(\zeta \) is the activation function, and \(\chi \) is the input to the node. We use the Adam optimization algorithm [71], and the batch size is kept constant at 256. The Adam optimization algorithm has three hyperparameters: the learning rate \(\alpha \), the first moment decay rate \(\beta _1\), and the second moment decay rate \(\beta _2\). We test our ANN for two learning rates, \(\alpha =0.001\) and 0.0001. The other two hyperparameters of the Adam optimization algorithm are fixed at \(\beta _1=0.9\) and \(\beta _2=0.999\). We employ the mean-squared error as the loss function, since this is a regression problem. We test the ANN with both point-to-point mapping and neighboring stencil mapping for four different numbers of hidden layers, \(L=2,3,5,7\). The ANN with point-to-point mapping is tested for four different numbers of neurons, \(N=20,30,40,50\), and the local stencil mapping is tested for \(N=40,60,80,100\). The number of neurons is higher for the local stencil mapping because it has more input features than the point-to-point mapping.
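The scaling and activation steps described above can be sketched in a few lines. This is an illustrative sketch with hypothetical function names, not the preprocessing code used in the study; it assumes the minimum and maximum are taken from the training set only and then reused for any other data.

```python
def fit_minmax(train_values):
    """Return the (minimum, maximum) of the training data."""
    return min(train_values), max(train_values)


def scale_to_unit_range(values, lo, hi):
    """Map values linearly so that lo -> -1 and hi -> +1."""
    span = hi - lo
    if span == 0:
        raise ValueError("training data are constant; cannot scale")
    return [2.0 * (v - lo) / span - 1.0 for v in values]


def relu(chi):
    """ReLU activation: zeta(chi) = max(0, chi)."""
    return max(0.0, chi)


if __name__ == "__main__":
    train = [0.0, 5.0, 10.0]          # illustrative values (assumed)
    lo, hi = fit_minmax(train)
    print(scale_to_unit_range(train, lo, hi))
    print(relu(-2.0), relu(3.0))
```

Because the bounds come from the training set, values from unseen snapshots can fall slightly outside \([-1,1]\) after scaling.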

The optimal ANN architecture is selected using a multi-dimensional gridsearch algorithm coupled with k-fold cross-validation. Cross-validation is a procedure used to estimate the performance of the neural network on unseen data. The procedure consists of dividing the training data into k groups and, for each group in turn, training the ANN on the remaining groups and evaluating the model’s performance on the held-out group. Therefore, if we use fivefold cross-validation, the model is trained five times and the performance index is computed for five groups. Once the performance for each group is available, the mean of the performance index is used to select the optimal hyperparameters. We use 500 epochs for determining the optimal hyperparameters. Good learning is achieved when both the training loss and the validation loss decrease until the learning rate is minimal. We use the coefficient of determination \(r^2\) as the performance index to decide the optimal hyperparameters. The coefficient of determination is calculated using the following formula

$$\begin{aligned} r^2 = 1 - \frac{\sum _{i}(y_i-{\tilde{y}}_i)^2}{\sum _{i}(y_i-{\bar{y}})^2}, \end{aligned}$$

where \(y_i\) is the true label, \({\tilde{y}}_i\) is the predicted label, and \({\bar{y}}\) is the mean of the true labels.
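As a concrete reading of the formula above, the coefficient of determination can be computed as follows. This is a minimal sketch with a hypothetical function name; it assumes the true labels are not all identical, since the denominator would otherwise vanish.

```python
def r_squared(y_true, y_pred):
    """Return r^2 = 1 - sum((y - y_pred)^2) / sum((y - mean(y))^2)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    labels = [1.0, 2.0, 3.0]                    # illustrative values (assumed)
    print(r_squared(labels, [1.0, 2.0, 3.0]))   # perfect prediction
    print(r_squared(labels, [2.0, 2.0, 2.0]))   # predicting the mean everywhere
```

A perfect prediction gives \(r^2 = 1\), while a model that always predicts the mean of the true labels gives \(r^2 = 0\).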

Figure 20 displays the performance index for the ANN with point-to-point mapping and the \({\mathbb {M}}3\) model for all hyperparameters tested using the gridsearch algorithm. The performance of the network does not change significantly with the hyperparameters, and the differences in performance are very small. The optimal hyperparameters obtained for the point-to-point mapping ANN are \(L=2\), \(N=40\), and \(\alpha =0.0001\). We use the same hyperparameters for the other two models, \({\mathbb {M}}1\) and \({\mathbb {M}}2\), with the point-to-point mapping ANN. We see similar behavior for the neighboring stencil mapping ANN with model \({\mathbb {M}}3\), as shown in Fig. 21. The optimal hyperparameters for the neighboring stencil mapping ANN are \(L=2\), \(N=40\), and \(\alpha =0.001\).
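The grid search with k-fold cross-validation described in this appendix can be sketched as below. This is a simplified illustration rather than the procedure's actual implementation: `train_and_score` is a hypothetical user-supplied callable that would train the network on the training indices and return \(r^2\) on the held-out indices, and the folds are contiguous blocks without shuffling, which is an assumption made for brevity.

```python
import itertools


def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search(grid, n_samples, k, train_and_score):
    """Return (best_params, best_mean_score) over all grid combinations.

    train_and_score(params, train_idx, val_idx) is a hypothetical callable
    that trains a model and returns its score on the held-out fold.
    """
    folds = kfold_indices(n_samples, k)
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for i, val_idx in enumerate(folds):
            # Train on every fold except fold i, validate on fold i.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            scores.append(train_and_score(params, train_idx, val_idx))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Grid tested for the point-to-point ANN in the text: 4 x 4 x 2 combinations.
GRID = {"L": [2, 3, 5, 7], "N": [20, 30, 40, 50], "alpha": [1e-3, 1e-4]}
```

With fivefold cross-validation this grid requires 32 × 5 = 160 training runs, which is why the search is restricted to three hyperparameters.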

Fig. 20

Hyperparameters search using the gridsearch algorithm combined with fivefold cross-validation for the neural network using point-to-point mapping with \({\mathbb {M}}3\)

Fig. 21

Hyperparameters search using the gridsearch algorithm combined with fivefold cross-validation for the neural network using neighboring stencil mapping with \({\mathbb {M}}3\)

As discussed in Sect. 4.1, we get poor agreement between the true and predicted stresses for point-to-point mapping with model \({\mathbb {M}}1\). Figure 22 shows the PDF of the true and predicted stresses computed with different activation functions. The predicted stresses are almost the same for all activation functions, so the poor prediction is not caused by the choice of activation function. Therefore, we conclude that additional input features, such as velocity gradients, are needed to improve the prediction with point-to-point mapping.

Fig. 22

Probability density function of the SGS stress distribution with point-to-point mapping. The ANN is trained using \({\mathbb {M}}1{:}\,\,\{{{\bar{u}},{\bar{v}}}\} \rightarrow \{{\tilde{\tau }}_{11},{\tilde{\tau }}_{12},{\tilde{\tau }}_{22}\}\) with different activation functions. The training set consists of 70 time snapshots from time \(t=0.0\) to \(t=3.5\), and the model is tested on the 400th snapshot at \(t=4.0\)

The CNN architecture has hyperparameters similar to those of the ANN. Additionally, we need to select the kernel shape and strides for CNNs. The stride is the amount by which the kernel shifts as it convolves over the input volume. We use a stride of 1 in both the x and y directions and a \(3 \times 3\) kernel in our CNN architecture. We check the performance of the CNN architecture for different numbers of hidden layers \(L=2,4,6,8\), different numbers of filters \(N=8,16,24,32\), and two learning rates. Figure 23 displays the performance index of the CNN for the different hyperparameters. The performance of the CNN is more sensitive to the learning rate, and we observe stable performance for the learning rate \(\alpha =0.001\). The performance is nearly the same for \(L=6,8,10\) with different numbers of kernels. We can select \(L=6\) and \(N=16\), which has a performance index of 0.76. Additionally, we test the CNN architecture with \(L=6\) and a [16, 8, 8, 8, 8, 16] distribution for the number of kernels along the hidden layers, and we observe a performance index of 0.75 at lower computational cost. Therefore, we apply \(L=6\), \(N=[16,8,8,8,8,16]\), and \(\alpha =0.001\) as our hyperparameters for the CNN architecture.
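The lower cost of the [16, 8, 8, 8, 8, 16] distribution can be made plausible by counting trainable weights in the hidden convolutional layers. The sketch below is only a rough proxy for computational cost: the number of input channels (`in_channels=2`) is an assumption made for illustration, the output layer is ignored, and each layer is assumed to carry one bias per filter.

```python
def conv_layer_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one 2D convolution with a kernel x kernel filter."""
    return (kernel * kernel * in_channels + 1) * out_channels


def hidden_stack_params(in_channels, filters, kernel=3):
    """Total trainable parameters of a stack of convolutional hidden layers."""
    total, channels = 0, in_channels
    for n_filters in filters:
        total += conv_layer_params(channels, n_filters, kernel)
        channels = n_filters  # output of this layer feeds the next one
    return total


if __name__ == "__main__":
    in_channels = 2  # hypothetical number of input fields, not from the text
    uniform = hidden_stack_params(in_channels, [16] * 6)
    tapered = hidden_stack_params(in_channels, [16, 8, 8, 8, 8, 16])
    print("uniform 16 filters :", uniform)
    print("[16,8,8,8,8,16]    :", tapered)
    print("ratio              :", round(tapered / uniform, 2))
```

Under these assumptions the tapered stack has well under half the weights of the uniform one, which is consistent in direction with the reduced cost reported for a nearly unchanged performance index.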

Fig. 23

Hyperparameters search using the gridsearch algorithm combined with fivefold cross-validation for CNN mapping with model \({\mathbb {M}}3\)

Appendix C: CPU time measurements

In this study, the pseudo-spectral solver used for DNS is written in the Python programming language. The code for coarsening variables from the fine grid to the coarse grid and the dynamic Smagorinsky model code are also written in Python. We use vectorization to get faster computational performance. The machine learning library Keras is also available in Python and is used for developing all data-driven closure models. Therefore, the CPU times reported in our analysis are for codes that are all developed on the same platform. We would like to highlight that when the trained model is deployed, the predict function is built on the first call, and hence that call takes slightly more time. Once the function is created, the CPU time for deployment is lower. Therefore, in all our tables, we report the CPU time for running the predict function the second time, since initializing CUDA kernels might yield a startup overhead, as shown in Listing 1, where t1 includes some idle time due to initializing kernels. In our study, we report t2, and we further verified that t3 − t2 = t2, which illustrates that the reported CPU times are consistent.
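The measurement pattern described above, where the first call is discarded as warm-up, can be sketched as follows. This is an illustrative sketch and not Listing 1: `LazyModel` is a hypothetical stand-in for a trained network whose first `predict` call pays a one-time setup cost, and `time.perf_counter` measures wall-clock time rather than CPU time.

```python
import time


def time_calls(fn, arg, n_calls=3):
    """Return the wall-clock duration of each of n_calls consecutive calls."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn(arg)
        durations.append(time.perf_counter() - start)
    return durations


class LazyModel:
    """Hypothetical model whose first predict call includes initialization."""

    def __init__(self):
        self._ready = False

    def predict(self, x):
        if not self._ready:
            time.sleep(0.05)  # mimics one-time function or kernel setup
            self._ready = True
        return [2.0 * v for v in x]


if __name__ == "__main__":
    t1, t2, t3 = time_calls(LazyModel().predict, [1.0, 2.0, 3.0])
    print("first call  (includes setup):", t1)
    print("second call (reported)      :", t2)
    print("third call                  :", t3)
```

Reporting the second call separates the steady-state deployment cost from the one-time initialization that dominates the first call.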


Appendix D: ANN and CNN architectures

We use the open-source Keras library to build our neural networks, with TensorFlow as the backend. Keras is widely used for fast prototyping, advanced research, and production because it is simple and easy to learn. The Keras library provides different options for optimizers, neural network architectures, activation functions, regularization, dropout, etc. Any simple neural network architecture can be coded in a few lines. The sample code for the ANN and CNN used in this work is listed in Listings 2 and 3.



Cite this article

Pawar, S., San, O., Rasheed, A. et al. A priori analysis on deep learning of subgrid-scale parameterizations for Kraichnan turbulence. Theor. Comput. Fluid Dyn. 34, 429–455 (2020).



  • Turbulence closure
  • Deep learning
  • Neural networks
  • Subgrid-scale modeling
  • Large eddy simulation