Abstract
In this paper we propose a spatial latent factor model for multivariate geostatistical skew-normal data. In this model we assume that the unobserved latent structure, responsible for the correlation among different variables as well as for the spatial autocorrelation among different sites, is Gaussian, and that the observed variables are skew-normal. We derive some properties of the model, such as its spatial autocorrelation structure and its finite-dimensional marginal distributions. The unknown parameters of the model are estimated with a Monte Carlo Expectation Maximization algorithm, whereas prediction at unobserved sites is performed using closed-form formulas and Markov chain Monte Carlo algorithms. Simulation studies have been performed to evaluate the soundness of the proposed procedures.
References
Allard, D., Naveau, P.: A new spatial skew-normal random field model. Commun. Stat. Theory 36, 1821–1834 (2007)
Azzalini, A.: The skew-normal distribution and related multivariate families. Scand. J. Stat. 32, 159–188 (2005)
Azzalini, A., Dalla Valle, A.: The multivariate skew-normal distribution. Biometrika 83, 715–726 (1996)
Bechler, A., Romary, T., Jeannée, N., Desnoyers, Y.: Geostatistical sampling optimization of contaminated facilities. Stoch. Env. Res. Risk A. 27, 1967–1974 (2013)
Brenning, A., Dubois, G.: Towards generic real-time mapping algorithms for environmental monitoring and emergency detection. Stoch. Env. Res. Risk A. 22, 601–611 (2008)
Chagneau, P., Mortier, F., Picard, N., Bacro, J.-N.: A hierarchical Bayesian model for spatial prediction of multivariate non-Gaussian random fields. Biometrics 67, 97–105 (2011)
De, S., Faria, Á.E.: Dynamic spatial Bayesian models for radioactivity deposition. J. Time Ser. Anal. 32, 607–617 (2011)
De Oliveira, V., Kedem, B., Short, D.A.: Bayesian prediction of transformed Gaussian random fields. J. Am. Stat. Assoc. 92, 1422–1433 (1997)
Desnoyers, Y., Chilès, J.-P., Dubot, D., Jeannée, N., Idasiak, J.-M.: Geostatistics for radiological evaluation: study of structuring of extreme values. Stoch. Env. Res. Risk A. 25, 1031–1037 (2011)
Diggle, P.J., Moyeed, R.A., Tawn, J.A.: Model-based geostatistics (with discussion). Appl. Stat. 47, 299–350 (1998)
Dubois, G., Galmarini, S.: Spatial interpolation comparison (SIC) 2004: introduction to the exercise and overview of results. In: Dubois, G. (ed.) Automatic Mapping Algorithms for Routine and Emergency Monitoring Data - Spatial Interpolation Comparison 2004, Office for Official Publication of the European Communities (2005)
Fort, G., Moulines, E.: Convergence of the Monte Carlo expectation maximization for curved exponential families. Ann. Stat. 31, 1220–1259 (2003)
González-Farías, G., Domínguez-Molina, J.A., Gupta, A.K.: The closed skew-normal distribution. In: Genton, M.G. (ed.) Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality, pp. 25–42. Chapman & Hall/CRC, London (2004)
González-Farías, G., Domínguez-Molina, J.A., Gupta, A.K.: Additive properties of skew-normal random vectors. J. Stat. Plan. Infer. 126, 521–534 (2004)
Gräler, B.: Modelling skewed spatial random fields through the spatial vine copula. Spat. Stat. (2014). http://dx.doi.org/10.1016/j.spasta.2014.01.001
Herranz, M., Romero, L.M., Idoeta, R., Olondo, C., Valiño, F., Legarda, F.: Inventory and vertical migration of ⁹⁰Sr fallout and ¹³⁷Cs/⁹⁰Sr ratio in Spanish mainland soils. J. Env. Radioact. 102, 987–994 (2011)
Hosseini, F., Eidsvik, J., Mohammadzadeh, M.: Approximate Bayesian inference in spatial GLMM with skew normal latent variables. Comput. Stat. Data Anal. 55, 1791–1806 (2011)
Kazianka, H., Pilz, J.: Copula-based geostatistical modeling of continuous and discrete data including covariates. Stoch. Env. Res. Risk A. 24, 661–673 (2010)
Kazianka, H., Pilz, J.: Bayesian spatial modeling and interpolation using copulas. Comput. Geosci. 37, 310–319 (2011)
Kim, H.-M., Mallick, B.K.: A Bayesian prediction using the skew Gaussian distribution. J. Stat. Plan. Infer. 120, 85–101 (2004)
Lunn, D., Spiegelhalter, D., Thomas, A., Best, N.: The BUGS project: evolution, critique and future directions. Stat. Med. 28, 3049–3067 (2009)
Maglione, D.S., Diblasi, A.M.: Exploring a valid model for the variogram of an isotropic spatial process. Stoch. Env. Res. Risk A. 18, 366–376 (2004)
McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions, 2nd edn. Wiley, New York (2007)
Minozzo, M., Ferracuti, L.: On the existence of some skew-normal stationary processes. Chil. J. Stat. 3, 157–170 (2012)
Minozzo, M., Ferrari, C.: Multivariate geostatistical mapping of radioactive contamination in the Maddalena Archipelago (Sardinia, Italy). AStA Adv. Stat. Anal. 97, 195–213 (2013)
Minozzo, M., Fruttini, D.: Loglinear spatial factor analysis: an application to diabetes mellitus complications. Environmetrics 15, 423–434 (2004)
Oliver, M.A., Badr, I.: Determining the spatial scale of variation in soil radon concentration. Math. Geol. 27, 893–922 (1995)
Pilz, J., Spöck, G.: Why do we need and how should we implement Bayesian kriging methods. Stoch. Env. Res. Risk A. 22, 621–632 (2008)
R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0 (2012)
Ren, Q., Banerjee, S.: Hierarchical factor models for large spatially misaligned data: a low-rank predictive process approach. Biometrics 69, 19–30 (2013)
Schmidt, A.M., Rodriguez, M.A.: Modelling multivariate counts varying continuously in space. Bayesian Stat. 9 (2011). doi:10.1093/acprof:oso/9780199694587.003.0020
Spöck, G.: Spatial sampling design with skew distributions: the special case of trans-Gaussian kriging. Ninth International Geostatistical Congress, Oslo, Norway, 11–15 June 2012, 20 pages
Wibrin, M., Bogaert, P., Fasbender, D.: Combining categorical and continuous spatial information within the Bayesian maximum entropy paradigm. Stoch. Env. Res. Risk A. 20, 423–433 (2006)
Zhang, H.: On estimation and prediction for spatial generalized linear mixed models. Biometrics 58, 129–136 (2002)
Zhang, H., El-Shaarawi, A.: On spatial skew-Gaussian processes and applications. Environmetrics 21, 33–47 (2010)
Acknowledgements
We gratefully acknowledge funding from the Italian Ministry of Education, University and Research (MIUR) through PRIN 2008 project 2008MRFM2H.
Appendix
In this appendix we report some distributional results regarding the observable processes \(Y_{i}(\mathbf{x})\). Let us first recall some definitions. Following, for instance, [2], we say that a random vector \(\mathbf{Y} = (Y_{1},\ldots,Y_{n})^{T}\) has an extended skew-normal distribution with parameters \(\boldsymbol{\mu}\), \(\boldsymbol{\varSigma}\), \(\boldsymbol{\alpha}\) and \(\tau\), and we write \(\mathbf{Y} \sim \mbox{ESN}_{n}(\boldsymbol{\mu},\boldsymbol{\varSigma},\boldsymbol{\alpha},\tau)\), if it has probability density function of the form
\[
f(\mathbf{y}) = \phi_{n}(\mathbf{y}-\boldsymbol{\mu};\boldsymbol{\varSigma})\,\frac{\varPhi\big(\alpha_{0} + \boldsymbol{\alpha}^{T}\mathbf{D}^{-1}(\mathbf{y}-\boldsymbol{\mu})\big)}{\varPhi(\tau)}, \qquad \mathbf{y} \in \mathbb{R}^{n}, \tag{10}
\]
where \(\boldsymbol{\mu} \in \mathbb{R}^{n}\) is a vector of location parameters, \(\phi_{n}(\,\cdot\,;\boldsymbol{\varSigma})\) is the n-dimensional normal density function with zero mean vector and (positive-definite) variance-covariance matrix \(\boldsymbol{\varSigma}\) having elements \(\sigma_{ij}\), \(\varPhi(\cdot)\) is the scalar N(0,1) distribution function, \(\mathbf{D} = \mbox{diag}(\sigma_{11},\ldots,\sigma_{nn})^{1/2}\) is the diagonal matrix formed with the standard deviations of the scale matrix \(\boldsymbol{\varSigma}\), \(\boldsymbol{\alpha} \in \mathbb{R}^{n}\) is a vector of skewness parameters, and \(\tau \in \mathbb{R}\) is an additional parameter. Moreover, \(\alpha_{0} = \tau(1 + \boldsymbol{\alpha}^{T}\mathbf{R}\boldsymbol{\alpha})^{1/2}\), where \(\mathbf{R}\) is the correlation matrix associated with \(\boldsymbol{\varSigma}\), that is, \(\mathbf{R} = \mathbf{D}^{-1}\boldsymbol{\varSigma}\mathbf{D}^{-1}\). Clearly, this distribution extends the multivariate normal distribution through the parameter vector \(\boldsymbol{\alpha}\), and for \(\boldsymbol{\alpha} = 0\) it reduces to the latter. When \(\tau = 0\), also \(\alpha_{0} = 0\) and (10) reduces to
\[
f(\mathbf{y}) = 2\,\phi_{n}(\mathbf{y}-\boldsymbol{\mu};\boldsymbol{\varSigma})\,\varPhi\big(\boldsymbol{\alpha}^{T}\mathbf{D}^{-1}(\mathbf{y}-\boldsymbol{\mu})\big), \qquad \mathbf{y} \in \mathbb{R}^{n}.
\]
In this case we simply say that \(\mathbf{Y}\) has a skew-normal distribution and we write, more concisely, \(\mathbf{Y} \sim \mbox{SN}_{n}(\boldsymbol{\mu},\boldsymbol{\varSigma},\boldsymbol{\alpha})\).
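As an illustration, the extended skew-normal density can be evaluated numerically. The following is a minimal sketch in pure Python, assuming the density takes the form of [2] as stated in the docstring; the function names (`esn_pdf`, `cholesky`, `mvn_pdf_zero_mean`, `std_normal_cdf`) are hypothetical helpers introduced here and are not part of the original work.

```python
import math


def std_normal_cdf(x):
    """Scalar N(0,1) distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive-definite matrix (list of lists)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def mvn_pdf_zero_mean(z, sigma):
    """n-dimensional normal density with zero mean and covariance sigma, evaluated at z."""
    n = len(z)
    low = cholesky(sigma)
    w = []  # solve low * w = z by forward substitution
    for i in range(n):
        w.append((z[i] - sum(low[i][k] * w[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in w)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return math.exp(-0.5 * (n * math.log(2.0 * math.pi) + log_det + quad))


def esn_pdf(y, mu, sigma, alpha, tau=0.0):
    """Assumed ESN density: phi_n(y - mu; Sigma) * Phi(alpha0 + alpha' D^{-1} (y - mu)) / Phi(tau).

    With tau = 0 this is the skew-normal density 2 * phi_n * Phi(alpha' D^{-1} (y - mu)).
    """
    n = len(y)
    z = [y[i] - mu[i] for i in range(n)]
    sd = [math.sqrt(sigma[i][i]) for i in range(n)]  # diagonal of D
    # alpha' R alpha, with R = D^{-1} Sigma D^{-1}
    quad = sum(alpha[i] * sigma[i][j] / (sd[i] * sd[j]) * alpha[j]
               for i in range(n) for j in range(n))
    alpha0 = tau * math.sqrt(1.0 + quad)
    lin = sum(alpha[i] * z[i] / sd[i] for i in range(n))
    return mvn_pdf_zero_mean(z, sigma) * std_normal_cdf(alpha0 + lin) / std_normal_cdf(tau)
```

Setting `alpha` to a zero vector recovers the normal density for any `tau`, and leaving `tau` at its default of 0 gives the skew-normal case, which mirrors the two reductions described above.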
According to [13] and [14], we say that the n-dimensional random vector \(\mathbf{Y} = (Y_{1},\ldots,Y_{n})^{T}\) has a multivariate closed skew-normal distribution, and we write \(\mathbf{Y} \sim \mbox{CSN}_{n,m}(\boldsymbol{\mu},\boldsymbol{\varSigma},\mathbf{D}_{c},\boldsymbol{\nu},\boldsymbol{\varDelta})\), if it has probability density function of the form
\[
f(\mathbf{y}) = \phi_{n}(\mathbf{y};\boldsymbol{\mu},\boldsymbol{\varSigma})\,\frac{\varPhi_{m}\big(\mathbf{D}_{c}^{T}(\mathbf{y}-\boldsymbol{\mu});\boldsymbol{\nu},\boldsymbol{\varDelta}\big)}{\varPhi_{m}\big(0;\boldsymbol{\nu},\boldsymbol{\varDelta} + \mathbf{D}_{c}^{T}\boldsymbol{\varSigma}\mathbf{D}_{c}\big)}, \qquad \mathbf{y} \in \mathbb{R}^{n},
\]
where: m is an integer greater than 0; \(\boldsymbol{\mu} \in \mathbb{R}^{n}\); \(\boldsymbol{\varSigma} \in \mathbb{R}^{n\times n}\) is a positive-definite matrix; \(\mathbf{D}_{c} \in \mathbb{R}^{n\times m}\) is an n × m matrix; \(\boldsymbol{\nu} \in \mathbb{R}^{m}\) is a vector; \(\boldsymbol{\varDelta} \in \mathbb{R}^{m\times m}\) is a positive-definite matrix; and \(\phi_{n}(\,\cdot\,;\boldsymbol{\mu},\boldsymbol{\varSigma})\) and \(\varPhi_{n}(\,\cdot\,;\boldsymbol{\mu},\boldsymbol{\varSigma})\) are the probability density function and the cumulative distribution function, respectively, of the n-dimensional normal distribution with mean vector \(\boldsymbol{\mu}\) and variance-covariance matrix \(\boldsymbol{\varSigma}\).
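A short worked check may help connect the two definitions. The sketch below assumes the closed skew-normal density is the normal density multiplied by a ratio of normal distribution functions, as in [13], and specializes it to \(n = m = 1\) with \(\boldsymbol{\varSigma} = \sigma^{2}\), \(\mathbf{D}_{c} = \alpha/\sigma\), \(\boldsymbol{\nu} = 0\) and \(\boldsymbol{\varDelta} = 1\); the symbol \(\sigma\) is introduced here only for illustration.

```latex
\begin{align*}
f(y) &= \phi_{1}(y;\mu,\sigma^{2})\,
        \frac{\varPhi_{1}\big(\tfrac{\alpha}{\sigma}(y-\mu);0,1\big)}
             {\varPhi_{1}\big(0;0,1+\tfrac{\alpha}{\sigma}\,\sigma^{2}\,\tfrac{\alpha}{\sigma}\big)} \\
     &= \phi_{1}(y;\mu,\sigma^{2})\,
        \frac{\varPhi\big(\alpha\,\sigma^{-1}(y-\mu)\big)}{\varPhi_{1}(0;0,1+\alpha^{2})} \\
     &= 2\,\phi_{1}(y;\mu,\sigma^{2})\,\varPhi\big(\alpha\,\sigma^{-1}(y-\mu)\big),
\end{align*}
% since a zero-mean normal distribution function evaluated at 0 equals 1/2
% whatever its variance.
```

Under these assumptions \(\mbox{CSN}_{1,1}(\mu,\sigma^{2},\alpha/\sigma,0,1)\) coincides with \(\mbox{SN}_{1}(\mu,\sigma^{2},\alpha)\); in particular, with \(\mu = 0\) and \(\sigma = 1\), \(\mbox{CSN}_{1,1}(0,1,\alpha,0,1)\) is the standard skew-normal distribution with skewness parameter \(\alpha\).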
Although, as already noted, the finite-dimensional marginal distributions of the multivariate spatial process \((Y_{1}(\mathbf{x}),\ldots,Y_{m}(\mathbf{x}))^{T}\), for \(\mathbf{x} \in \mathbb{R}^{2}\), are not skew-normal (in the sense of [2]), it is possible to show that they are closed skew-normal, according to the definition of [13]. This implies that, for any given \(i = 1,\ldots,m\), all the finite-dimensional marginal distributions of each univariate spatial process \(Y_{i}(\mathbf{x})\) are closed skew-normal. To see this (see also [24]), consider n spatial locations \(\mathbf{x}_{1},\ldots,\mathbf{x}_{n}\) and the corresponding n-dimensional random vector \(\mathbf{Y} = (Y_{i}(\mathbf{x}_{1}),\ldots,Y_{i}(\mathbf{x}_{n}))^{T}\). Recalling that for any given \(\mathbf{x} \in \mathbb{R}^{2}\) we can write \(Y_{i}(\mathbf{x}) = \beta_{i} + Z_{i}(\mathbf{x}) + \omega_{i}S_{i}(\mathbf{x})\), the vector \(\mathbf{Y}\) can be written as \(\mathbf{Y} = \beta_{i}\mathbf{1}_{n} + \mathbf{Z} + \mathbf{D}_{\omega}\mathbf{S} = \mathbf{W} + \mathbf{V}\), where \(\mathbf{W} = \beta_{i}\mathbf{1}_{n} + \mathbf{Z}\), \(\mathbf{V} = \mathbf{D}_{\omega}\mathbf{S}\), \(\mathbf{Z} = (Z_{i}(\mathbf{x}_{1}),\ldots,Z_{i}(\mathbf{x}_{n}))^{T}\), \(\mathbf{S} = (S_{i}(\mathbf{x}_{1}),\ldots,S_{i}(\mathbf{x}_{n}))^{T}\) and \(\mathbf{D}_{\omega}\) is the n × n diagonal matrix with \(\omega_{i}\) on the diagonal. Now, since the \(S_{i}(\mathbf{x})\), for \(\mathbf{x} \in \mathbb{R}^{2}\), are independently and identically distributed as \(\mbox{CSN}_{1,1}(0,1,\alpha_{i},0,1)\), according to Theorem 3 of [14] we have that \(\mathbf{S} \sim \mbox{CSN}_{n,n}(0,\mathbf{I}_{n},\mathbf{D}_{\alpha},0,\mathbf{I}_{n})\), where \(\mathbf{D}_{\alpha}\) is the n × n diagonal matrix with \(\alpha_{i}\) on the diagonal. On the other hand, since \(\mathbf{Z}\) follows a multivariate normal distribution with mean 0 and covariance matrix \(\boldsymbol{\varSigma}_{Z}\) with entries given by \(\mathrm{Cov}\big[Z_{i}(\mathbf{x}),Z_{i}(\mathbf{x}+\mathbf{h})\big] = \varsigma_{i}^{2}\rho(\mathbf{h})\), we also have that \(\mathbf{Z} \sim \mbox{CSN}_{n,1}(0,\boldsymbol{\varSigma}_{Z},0,0,1)\).
Moreover, since \(\mathbf{W}\) is distributed as a multivariate normal with mean \(\beta_{i}\mathbf{1}_{n}\) and covariance matrix \(\boldsymbol{\varSigma}_{Z}\), we can write \(\mathbf{W} \sim \mbox{CSN}_{n,1}(\beta_{i}\mathbf{1}_{n},\boldsymbol{\varSigma}_{Z},0,0,1)\), and using Theorem of [14] we can also write \(\mathbf{V} \sim \mbox{CSN}_{n,n}(0,\mathbf{D}_{\omega^{2}},\mathbf{D}_{\alpha/\omega},0,\mathbf{I}_{n})\), where \(\mathbf{D}_{\omega^{2}}\) is the n × n diagonal matrix with \(\omega_{i}^{2}\) on the diagonal, and \(\mathbf{D}_{\alpha/\omega}\) is the n × n diagonal matrix with \(\alpha_{i}/\omega_{i}\) on the diagonal. Thus, considering that \(\mathbf{Y} = \mathbf{W} + \mathbf{V}\), we can conclude, using Theorem 4 of [14], that \(\mathbf{Y} \sim \mbox{CSN}_{n,n+1}(\beta_{i}\mathbf{1}_{n},\boldsymbol{\varSigma}_{Z} + \omega_{i}^{2}\mathbf{I}_{n},\mathbf{D}^{\ast},0,\boldsymbol{\varDelta}^{\ast})\), for some matrices \(\mathbf{D}^{\ast}\) and \(\boldsymbol{\varDelta}^{\ast}\).
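The decomposition \(\mathbf{Y} = \beta_{i}\mathbf{1}_{n} + \mathbf{Z} + \mathbf{D}_{\omega}\mathbf{S}\) also suggests a direct way to simulate one observable process at n sites. The following is a minimal sketch in pure Python under two assumptions that are not taken from the text: an exponential correlation function \(\rho(\mathbf{h}) = \exp(-\|\mathbf{h}\|/\phi)\) with a hypothetical range parameter `phi`, and the standard stochastic representation \(S = \delta|U_{0}| + (1-\delta^{2})^{1/2}U_{1}\), with \(\delta = \alpha/(1+\alpha^{2})^{1/2}\) and \(U_{0}, U_{1}\) independent N(0,1), to draw the skew-normal terms. The names `simulate_y`, `exp_corr` and `cholesky` are illustrative only.

```python
import math
import random


def exp_corr(h, phi):
    """Exponential correlation function (an assumed choice for rho)."""
    return math.exp(-h / phi)


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive-definite matrix (list of lists)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def simulate_y(sites, beta, varsigma2, omega, alpha, phi, rng):
    """Draw Y = beta * 1_n + Z + omega * S at the given (distinct) sites.

    Z is zero-mean Gaussian with Cov[Z(x), Z(x + h)] = varsigma2 * rho(h);
    the S terms are i.i.d. skew-normal SN(0, 1, alpha).
    """
    n = len(sites)
    cov = [[varsigma2 * exp_corr(math.dist(sites[a], sites[b]), phi)
            for b in range(n)] for a in range(n)]
    low = cholesky(cov)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [sum(low[a][k] * u[k] for k in range(a + 1)) for a in range(n)]
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    s = [delta * abs(rng.gauss(0.0, 1.0))
         + math.sqrt(1.0 - delta * delta) * rng.gauss(0.0, 1.0)
         for _ in range(n)]
    return [beta + z[a] + omega * s[a] for a in range(n)]
```

For example, `simulate_y([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)], 2.0, 1.0, 1.0, 3.0, 1.5, random.Random(0))` returns one realization at three sites. Under this representation each coordinate has mean \(\beta_{i} + \omega_{i}\delta(2/\pi)^{1/2}\), which gives a simple sanity check on repeated draws.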
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bagnato, L., Minozzo, M. (2014). A Latent Variable Approach to Modelling Multivariate Geostatistical Skew-Normal Data. In: Carpita, M., Brentari, E., Qannari, E. (eds) Advances in Latent Variables. Studies in Theoretical and Applied Statistics. Springer, Cham. https://doi.org/10.1007/10104_2014_14
DOI: https://doi.org/10.1007/10104_2014_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02966-5
Online ISBN: 978-3-319-02967-2
eBook Packages: Mathematics and Statistics (R0)