Two Statistical Methods for Improving the Analysis of Large Climatic Data Sets: General Skewed Kalman Filters and Distributions of Distributions

  • Conference paper
geoENV IV — Geostatistics for Environmental Applications

Part of the book series: Quantitative Geology and Geostatistics ((QGAG,volume 13))

Abstract

This research focuses on two original statistical methods for analyzing large data sets in the context of climate studies. First, we propose a new way to introduce skewness into state-space models without losing the computational advantages of the Kalman filter operations. The motivation stems from the popularity of state-space models and statistical data assimilation techniques in geophysics, especially for real-time forecasting. The added skewness comes from extending the multivariate normal distribution to the general multivariate skew-normal distribution. We derive a specific new state-space model and describe its Kalman filtering operations in detail. The second part of this work is dedicated to extending clustering methods to the distributions of distributions framework. This concept allows us to cluster distributions instead of simple observations. To illustrate the applicability of such a method, we analyze the distributions of 16200 temperature and humidity vertical profiles. Different levels of dependence between these distributions are modeled by copulas. The distributions of distributions are decomposed as mixtures, and the algorithm for estimating the parameters of such mixtures is presented. Besides providing realistic climatic classes, this clustering method allows atmospheric scientists to explore large climate data sets within a more meaningful and global framework.
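As a point of reference for the first method, the Kalman filter operations whose computational advantages the skewed model aims to preserve can be sketched for the standard Gaussian case. This is a minimal sketch of the classical linear filter only, not the skewed filter derived in the paper. The function name `kalman_step` and the symbols `m`, `P`, `F`, `H`, `Q`, `R` are illustrative assumptions rather than the paper's notation.

```python
import numpy as np

def kalman_step(m, P, y, F, H, Q, R):
    """One predict/update cycle of the standard linear Gaussian Kalman filter.

    m, P : prior state mean (n,) and covariance (n, n)
    y    : new observation (k,)
    F, H : state transition (n, n) and observation (k, n) matrices
    Q, R : state noise (n, n) and observation noise (k, k) covariances
    All names here are hypothetical, chosen for illustration only.
    """
    # Predict: propagate the mean and covariance through the state equation.
    m_pred = F @ m
    P_pred = F @ P @ F.T + Q

    # Innovation covariance and Kalman gain K = P_pred H^T S^{-1}.
    S = H @ P_pred @ H.T + R
    K = np.linalg.solve(S, H @ P_pred).T  # valid because S and P_pred are symmetric

    # Update: correct the prediction with the observed innovation.
    m_new = m_pred + K @ (y - H @ m_pred)
    P_new = (np.eye(len(m)) - K @ H) @ P_pred
    return m_new, P_new

# Scalar example: a unit-variance prior combined with a unit-variance observation.
m1, P1 = kalman_step(
    m=np.array([0.0]), P=np.array([[1.0]]), y=np.array([2.0]),
    F=np.array([[1.0]]), H=np.array([[1.0]]),
    Q=np.array([[0.0]]), R=np.array([[1.0]]),
)
```

Each cycle needs only matrix products and one linear solve, which is the closed-form recursion the abstract refers to when it speaks of keeping the computational advantages of the Kalman filter.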

Copyright information

© 2004 Kluwer Academic Publishers

About this paper

Cite this paper

Naveau, P., Vrac, M., Genton, M.G., Chédin, A., Diday, E. (2004). Two Statistical Methods for Improving the Analysis of Large Climatic Data Sets: General Skewed Kalman Filters and Distributions of Distributions. In: Sanchez-Vila, X., Carrera, J., Gómez-Hernández, J.J. (eds) geoENV IV — Geostatistics for Environmental Applications. Quantitative Geology and Geostatistics, vol 13. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2115-1_1

  • DOI: https://doi.org/10.1007/1-4020-2115-1_1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-2007-0

  • Online ISBN: 978-1-4020-2115-2

  • eBook Packages: Springer Book Archive
