Abstract
This research focuses on two original statistical methods for analyzing large data sets in the context of climate studies. First, we propose a new way to introduce skewness into state-space models without losing the computational advantages of the Kalman filter operations. The motivation stems from the popularity of state-space models and statistical data assimilation techniques in geophysics, especially for real-time forecasting. The added skewness comes from extending the multivariate normal distribution to the general multivariate skew-normal distribution. We derive a specific new state-space model and carefully describe its Kalman filtering operations. The second part of this work is dedicated to extending clustering methods to the distributions-of-distributions framework. This concept allows us to cluster distributions instead of simple observations. To illustrate the applicability of such a method, we analyze the distributions of 16200 temperature and humidity vertical profiles. Different levels of dependence between these distributions are modeled by copulas. The distributions of distributions are decomposed as mixtures, and the algorithm used to estimate the parameters of such mixtures is presented. Besides providing realistic climatic classes, this clustering method allows atmospheric scientists to explore large climate data sets within a more meaningful and global framework.
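As a rough illustration of how skewness can be added to a Gaussian variable, the sketch below samples from the basic univariate skew-normal distribution of Azzalini, using the standard representation Z = δ|U0| + sqrt(1 − δ²) U1 with U0 and U1 independent standard normals and δ = α / sqrt(1 + α²). This is only the simplest member of the family, not the general multivariate skew-normal distribution or the skewed Kalman filter developed in the chapter, and the function names `sample_skew_normal` and `sample_skewness` are hypothetical names chosen for this example.

```python
import math
import random


def sample_skew_normal(alpha, rng):
    """Draw one value from the univariate skew-normal with shape alpha.

    Uses Z = delta * |U0| + sqrt(1 - delta^2) * U1, where U0 and U1 are
    independent standard normals and delta = alpha / sqrt(1 + alpha^2).
    With alpha = 0 this reduces to a standard normal draw.
    """
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    u0 = rng.gauss(0.0, 1.0)
    u1 = rng.gauss(0.0, 1.0)
    return delta * abs(u0) + math.sqrt(1.0 - delta * delta) * u1


def sample_skewness(values):
    """Moment-based sample skewness: m3 / m2^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed so the run is reproducible
symmetric = [sample_skew_normal(0.0, rng) for _ in range(50_000)]
skewed = [sample_skew_normal(5.0, rng) for _ in range(50_000)]

print("skewness, alpha = 0:", sample_skewness(symmetric))
print("skewness, alpha = 5:", sample_skewness(skewed))
```

With α = 0 the sample skewness should be close to zero, while a positive α gives a clearly right-skewed sample. The draw is built entirely from Gaussian variables, which hints at why Gaussian-style computations can remain tractable in this family.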
Copyright information
© 2004 Kluwer Academic Publishers
About this paper
Cite this paper
Naveau, P., Vrac, M., Genton, M.G., Chédin, A., Diday, E. (2004). Two Statistical Methods for Improving the Analysis of Large Climatic Data Sets: General Skewed Kalman Filters and Distributions of Distributions. In: Sanchez-Vila, X., Carrera, J., Gómez-Hernández, J.J. (eds) geoENV IV — Geostatistics for Environmental Applications. Quantitative Geology and Geostatistics, vol 13. Springer, Dordrecht. https://doi.org/10.1007/1-4020-2115-1_1
DOI: https://doi.org/10.1007/1-4020-2115-1_1
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-2007-0
Online ISBN: 978-1-4020-2115-2
eBook Packages: Springer Book Archive