Fast Data-Obtaining Algorithm for Data Assimilation with Large Data Set

Abstract

Data assimilation is an analysis technique that combines observations with numerical results from theoretical models to produce more realistic and accurate data. It is widely used in studies of the atmosphere, ocean, and land surface. Because the input data from dynamical models have a complicated structure and the volume of model data keeps growing, parallel data assimilation suffers from high overhead in file reading and data communication. In this paper, we first propose a flexible parallel data access approach for reading large amounts of data from disk. This approach avoids data access conflicts and significantly reduces the number of disk addressing operations. Next, we design a communication-avoiding strategy that reduces the communication volume at the cost of some additional computation. Furthermore, we present a “pipe-flow” scheme for data exchange that performs conflict-free message passing. Together, these techniques yield a fast data-obtaining algorithm for data assimilation. Our experiments show that the algorithm achieves a \(5\times \) speedup over the baseline for data obtaining in parallel data assimilation. Owing to the reduction in disk addressing operations, the new approach achieves a \(6\times \) speedup on average for file reading. Because a large amount of data movement is avoided, it achieves a \(2.7\times \) speedup on average for communication between processors.
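
The idea of lowering the number of disk addressing operations can be illustrated with a generic range-coalescing sketch. This is not the approach of the paper, whose details are not given in the abstract; it only shows one well-known way to trade extra bytes read for fewer seeks. The function names (`coalesce_ranges`, `strided_requests`), the `max_gap` parameter, and the grid sizes are all hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (byte offset, length in bytes)


def coalesce_ranges(ranges: List[Range], max_gap: int) -> List[Range]:
    """Merge byte ranges whose separating gap is at most max_gap bytes.

    Each merged range costs one seek; the gap bytes inside a merged
    range are read and then discarded by the caller.
    """
    merged: List[Range] = []
    for offset, length in sorted(ranges):
        if merged:
            last_off, last_len = merged[-1]
            last_end = last_off + last_len
            if offset - last_end <= max_gap:
                # Extend the previous range to cover this one.
                new_end = max(last_end, offset + length)
                merged[-1] = (last_off, new_end - last_off)
                continue
        merged.append((offset, length))
    return merged


def strided_requests(row_bytes: int, n_rows: int,
                     col_offset: int, seg_bytes: int) -> List[Range]:
    """Byte ranges for one column block of a row-major 2D grid (assumed layout)."""
    return [(r * row_bytes + col_offset, seg_bytes) for r in range(n_rows)]


# Hypothetical sub-domain: 8 rows of 4096 bytes, 2048 bytes wanted per row.
requests = strided_requests(row_bytes=4096, n_rows=8,
                            col_offset=1024, seg_bytes=2048)
merged = coalesce_ranges(requests, max_gap=2048)

wanted = sum(length for _, length in requests)
fetched = sum(length for _, length in merged)
print(f"seeks: {len(requests)} -> {len(merged)}")
print(f"bytes wanted: {wanted}, bytes fetched: {fetched}")
```

With these assumed sizes, eight separate reads collapse into a single contiguous read that fetches more bytes than strictly needed; whether that trade pays off depends on the storage system.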



Acknowledgements

The authors would like to thank all anonymous reviewers for their valuable comments and helpful suggestions.

Author information

Corresponding author

Correspondence to Junmin Xiao.

Additional information

This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFC1401706 and the National Natural Science Foundation of China under Grant No. 61802369.

About this article

Cite this article

Xiao, J., Zhang, G., Gao, Y. et al. Fast Data-Obtaining Algorithm for Data Assimilation with Large Data Set. Int J Parallel Prog 48, 750–770 (2020). https://doi.org/10.1007/s10766-019-00653-y

Keywords

  • Data assimilation
  • I/O optimization
  • Communication optimization
  • Parallel implementation
  • Domain localization