A Latent Variable Approach to Modelling Multivariate Geostatistical Skew-Normal Data

Bagnato, Luca; Minozzo, Marco

doi:10.1007/10104_2014_14

Part of the book series: Studies in Theoretical and Applied Statistics (STASSPSS)

Abstract

In this paper we propose a spatial latent factor model for multivariate geostatistical skew-normal data. We assume that the unobserved latent structure, which is responsible both for the correlation among different variables and for the spatial autocorrelation among different sites, is Gaussian, and that the observed variables are skew-normal. We provide some properties of the model, such as its spatial autocorrelation structure and its finite-dimensional marginal distributions. The unknown parameters are estimated with a Monte Carlo Expectation Maximization algorithm, whereas prediction at unobserved sites is performed using closed-form formulas and Markov chain Monte Carlo algorithms. Simulation studies were performed to evaluate the soundness of the proposed procedures.

References

1. Allard, D., Naveau, P.: A new spatial skew-normal random field model. Commun. Stat. Theory 36, 1821–1834 (2007)
2. Azzalini, A.: The skew-normal distribution and related multivariate families. Scand. J. Stat. 32, 159–188 (2005)
3. Azzalini, A., Dalla Valle, A.: The multivariate skew-normal distribution. Biometrika 83, 715–726 (1996)
4. Bechler, A., Romary, T., Jeannée, N., Desnoyers, Y.: Geostatistical sampling optimization of contaminated facilities. Stoch. Env. Res. Risk A. 27, 1967–1974 (2013)
5. Brenning, A., Dubois, G.: Towards generic real-time mapping algorithms for environmental monitoring and emergency detection. Stoch. Env. Res. Risk A. 22, 601–611 (2008)
6. Chagneau, P., Mortier, F., Picard, N., Bacro, J.-N.: A hierarchical Bayesian model for spatial prediction of multivariate non-Gaussian random fields. Biometrics 67, 97–105 (2011)
7. De, S., Faria, Á.E.: Dynamic spatial Bayesian models for radioactivity deposition. J. Time Ser. Anal. 32, 607–617 (2011)
8. De Oliveira, V., Kedem, B., Short, D.A.: Bayesian prediction of transformed Gaussian random fields. J. Am. Stat. Assoc. 92, 1422–1433 (1997)
9. Desnoyers, Y., Chilès, J.-P., Dubot, D., Jeannée, N., Idasiak, J.-M.: Geostatistics for radiological evaluation: study of structuring of extreme values. Stoch. Env. Res. Risk A. 25, 1031–1037 (2011)
10. Diggle, P.J., Moyeed, R.A., Tawn, J.A.: Model-based geostatistics (with discussion). Appl. Stat. 47, 299–350 (1998)
11. Dubois, G., Galmarini, S.: Spatial interpolation comparison (SIC) 2004: introduction to the exercise and overview of results. In: Dubois, G. (ed.) Automatic Mapping Algorithms for Routine and Emergency Monitoring Data - Spatial Interpolation Comparison 2004, Office for Official Publication of the European Communities (2005)
12. Fort, G., Moulines, E.: Convergence of the Monte Carlo expectation maximization for curved exponential families. Ann. Stat. 31, 1220–1259 (2003)
13. González-Farías, G., Domínguez-Molina, J.A., Gupta, A.K.: The closed skew-normal distribution. In: Genton, M.G. (ed.) Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality, pp. 25–42. Chapman & Hall/CRC, London (2004)
14. González-Farías, G., Domínguez-Molina, J.A., Gupta, A.K.: Additive properties of skew-normal random vectors. J. Stat. Plan. Infer. 126, 521–534 (2004)
15. Gräler, B.: Modelling skewed spatial random fields through the spatial vine copula. Spat. Stat. (2014). http://dx.doi.org/10.1016/j.spasta.2014.01.001
16. Herranz, M., Romero, L.M., Idoeta, R., Olondo, C., Valiño, F., Legarda, F.: Inventory and vertical migration of ⁹⁰Sr fallout and ¹³⁷Cs/⁹⁰Sr ratio in Spanish mainland soils. J. Env. Radioact. 102, 987–994 (2011)
17. Hosseini, F., Eidsvik, J., Mohammadzadeh, M.: Approximate Bayesian inference in spatial GLMM with skew normal latent variables. Comput. Stat. Data Anal. 55, 1791–1806 (2011)
18. Kazianka, H., Pilz, J.: Copula-based geostatistical modeling of continuous and discrete data including covariates. Stoch. Env. Res. Risk A. 24, 661–673 (2010)
19. Kazianka, H., Pilz, J.: Bayesian spatial modeling and interpolation using copulas. Comput. Geosci. 37, 310–319 (2011)
20. Kim, H.-M., Mallick, B.K.: A Bayesian prediction using the skew Gaussian distribution. J. Stat. Plan. Infer. 120, 85–101 (2004)
21. Lunn, D., Spiegelhalter, D., Thomas, A., Best, N.: The BUGS project: evolution, critique and future directions. Stat. Med. 28, 3049–3067 (2009)
22. Maglione, D.S., Diblasi, A.M.: Exploring a valid model for the variogram of an isotropic spatial process. Stoch. Env. Res. Risk A. 18, 366–376 (2004)
23. McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions, 2nd edn. Wiley, New York (2007)
24. Minozzo, M., Ferracuti, L.: On the existence of some skew-normal stationary processes. Chil. J. Stat. 3, 157–170 (2012)
25. Minozzo, M., Ferrari, C.: Multivariate geostatistical mapping of radioactive contamination in the Maddalena Archipelago (Sardinia, Italy). AStA Adv. Stat. Anal. 97, 195–213 (2013)
26. Minozzo, M., Fruttini, D.: Loglinear spatial factor analysis: an application to diabetes mellitus complications. Environmetrics 15, 423–434 (2004)
27. Oliver, M.A., Badr, I.: Determining the spatial scale of variation in soil radon concentration. Math. Geol. 27, 893–922 (1995)
28. Pilz, J., Spöck, G.: Why do we need and how should we implement Bayesian kriging methods. Stoch. Env. Res. Risk A. 22, 621–632 (2008)
29. R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0 (2012)
30. Ren, Q., Banerjee, S.: Hierarchical factor models for large spatially misaligned data: a low-rank predictive process approach. Biometrics 69, 19–30 (2013)
31. Schmidt, A.M., Rodriguez, M.A.: Modelling multivariate counts varying continuously in space. Bayesian Stat. 9 (2011). doi:10.1093/acprof:oso/9780199694587.003.0020
32. Spöck, G.: Spatial sampling design with skew distributions: the special case of trans-Gaussian kriging. Ninth International Geostatistical Congress, Oslo, Norway, 11–15 June 2012, 20 pages
33. Wibrin, M., Bogaert, P., Fasbender, D.: Combining categorical and continuous spatial information within the Bayesian maximum entropy paradigm. Stoch. Env. Res. Risk A. 20, 423–433 (2006)
34. Zhang, H.: On estimation and prediction for spatial generalized linear mixed models. Biometrics 58, 129–136 (2002)
35. Zhang, H., El-Shaarawi, A.: On spatial skew-Gaussian processes and applications. Environmetrics 21, 33–47 (2010)

Acknowledgements

We gratefully acknowledge funding from the Italian Ministry of Education, University and Research (MIUR) through PRIN 2008 project 2008MRFM2H.

Author information

Authors and Affiliations

Università Cattolica del Sacro Cuore, Milano, Italy
Luca Bagnato
Dipartimento di Scienze Economiche, Università degli Studi di Verona, Via dell’Artigliere 19, 37129, Verona, Italy
Marco Minozzo

Corresponding authors

Correspondence to Luca Bagnato or Marco Minozzo.

Editor information

Editors and Affiliations

Dept. of Economics and Management, University of Brescia, Brescia, Italy
Maurizio Carpita
Dept. of Economics and Management, University of Brescia, Brescia, Italy
Eugenio Brentari
Dept. of Chemometrics and Sensometrics, Oniris Nantes National College, Nantes, France
El Mostafa Qannari

Appendix

In this appendix we report some distributional results regarding the observable processes $Y_{i}(\mathbf{x})$. Let us first recall some definitions. Following, for instance, [2], we say that a random vector $\mathbf{Y} = (Y_{1},\ldots,Y_{n})^{T}$ has an extended skew-normal distribution with parameters $\boldsymbol{\mu}$, $\boldsymbol{\varSigma}$, $\boldsymbol{\alpha}$ and $\tau$, and we write $\mathbf{Y} \sim \mbox{ESN}_{n}(\boldsymbol{\mu},\boldsymbol{\varSigma},\boldsymbol{\alpha},\tau)$, if it has a probability density function of the form

$$f(\mathbf{y}) = \phi_{n}(\mathbf{y}-\boldsymbol{\mu};\boldsymbol{\varSigma}) \cdot \varPhi(\alpha_{0} + \boldsymbol{\alpha}^{T}\mathbf{D}^{-1}(\mathbf{y}-\boldsymbol{\mu}))/\varPhi(\tau), \qquad \mbox{for}\ \ \mathbf{y} \in \mathbb{R}^{n}, \tag{10}$$

where $\boldsymbol{\mu} \in \mathbb{R}^{n}$ is a vector of location parameters, $\phi_{n}(\,\cdot\,;\boldsymbol{\varSigma})$ is the $n$-dimensional normal density function with zero mean vector and (positive-definite) variance-covariance matrix $\boldsymbol{\varSigma}$ having elements $\sigma_{ij}$, $\varPhi(\cdot)$ is the scalar $N(0,1)$ distribution function, $\mathbf{D} = \mbox{diag}(\sigma_{11},\ldots,\sigma_{nn})^{1/2}$ is the diagonal matrix formed with the standard deviations of the scale matrix $\boldsymbol{\varSigma}$, $\boldsymbol{\alpha} \in \mathbb{R}^{n}$ is a vector of skewness parameters, and $\tau \in \mathbb{R}$ is an additional parameter. Moreover, $\alpha_{0} = \tau(1 + \boldsymbol{\alpha}^{T}\mathbf{R}\boldsymbol{\alpha})^{1/2}$, where $\mathbf{R}$ is the correlation matrix associated with $\boldsymbol{\varSigma}$, that is, $\mathbf{R} = \mathbf{D}^{-1}\boldsymbol{\varSigma}\mathbf{D}^{-1}$. Clearly, this distribution extends the multivariate normal distribution through the parameter vector $\boldsymbol{\alpha}$, and for $\boldsymbol{\alpha} = \mathbf{0}$ it reduces to the latter. When $\tau = 0$, we also have $\alpha_{0} = 0$ and (10) reduces to

$$f(\mathbf{y}) = 2 \cdot \phi_{n}(\mathbf{y}-\boldsymbol{\mu};\boldsymbol{\varSigma}) \cdot \varPhi(\boldsymbol{\alpha}^{T}\mathbf{D}^{-1}(\mathbf{y}-\boldsymbol{\mu})), \qquad \mbox{for}\ \ \mathbf{y} \in \mathbb{R}^{n}. \tag{11}$$

In this case we simply say that $\mathbf{Y}$ has a skew-normal distribution and we write, more concisely, $\mathbf{Y} \sim \mbox{SN}_{n}(\boldsymbol{\mu},\boldsymbol{\varSigma},\boldsymbol{\alpha})$.
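As an illustration of definitions (10) and (11), the following minimal Python sketch evaluates the extended skew-normal density at a single point. The function name `esn_pdf` and all numerical values are hypothetical choices made for this example and are not taken from the chapter; the sketch relies only on NumPy and SciPy.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm


def esn_pdf(y, mu, Sigma, alpha, tau=0.0):
    """Extended skew-normal density (10); tau = 0 gives the skew-normal density (11)."""
    y, mu, alpha = (np.asarray(v, dtype=float) for v in (y, mu, alpha))
    Sigma = np.asarray(Sigma, dtype=float)
    d = np.sqrt(np.diag(Sigma))              # standard deviations (diagonal of D)
    R = Sigma / np.outer(d, d)               # correlation matrix R = D^{-1} Sigma D^{-1}
    alpha0 = tau * np.sqrt(1.0 + alpha @ R @ alpha)
    # phi_n(y - mu; Sigma): zero-mean normal density evaluated at y - mu
    gauss = multivariate_normal(mean=np.zeros(len(mu)), cov=Sigma).pdf(y - mu)
    return gauss * norm.cdf(alpha0 + alpha @ ((y - mu) / d)) / norm.cdf(tau)


# Illustrative values only (not from the chapter).
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
mu = np.array([1.0, -1.0])
y = np.array([0.5, 0.2])
alpha = np.array([3.0, -1.0])

sn_value = esn_pdf(y, mu, Sigma, alpha)                      # density (11), tau = 0
esn_value = esn_pdf(y, mu, Sigma, alpha, tau=0.7)            # density (10)
normal_value = esn_pdf(y, mu, Sigma, np.zeros(2), tau=0.7)   # alpha = 0: plain normal
print(sn_value, esn_value, normal_value)
```

With $\tau = 0$ the divisor $\varPhi(\tau)$ equals $1/2$, which is where the factor 2 in (11) comes from, and with $\boldsymbol{\alpha} = \mathbf{0}$ the skewing factor cancels so that the value coincides with the ordinary normal density.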

According to [13] and [14], we say that the $n$-dimensional random vector $\mathbf{Y} = (Y_{1},\ldots,Y_{n})^{T}$ has a multivariate closed skew-normal distribution, and we write $\mathbf{Y} \sim \mbox{CSN}_{n,m}(\boldsymbol{\mu},\boldsymbol{\varSigma},\mathbf{D}_{c},\boldsymbol{\nu},\boldsymbol{\varDelta})$, if it has a probability density function of the form

$$f(\mathbf{y}) = \frac{1}{\varPhi_{m}(\mathbf{0};\boldsymbol{\nu},\boldsymbol{\varDelta} + \mathbf{D}_{c}^{T}\boldsymbol{\varSigma}\mathbf{D}_{c})} \cdot \phi_{n}(\mathbf{y};\boldsymbol{\mu},\boldsymbol{\varSigma}) \cdot \varPhi_{m}(\mathbf{D}_{c}^{T}(\mathbf{y}-\boldsymbol{\mu});\boldsymbol{\nu},\boldsymbol{\varDelta}), \qquad \mbox{for}\ \ \mathbf{y} \in \mathbb{R}^{n}, \tag{12}$$

where: $m$ is an integer greater than 0; $\boldsymbol{\mu} \in \mathbb{R}^{n}$; $\boldsymbol{\varSigma} \in \mathbb{R}^{n\times n}$ is a positive-definite matrix; $\mathbf{D}_{c} \in \mathbb{R}^{n\times m}$ is an $n \times m$ matrix; $\boldsymbol{\nu} \in \mathbb{R}^{m}$ is a vector; $\boldsymbol{\varDelta} \in \mathbb{R}^{m\times m}$ is a positive-definite matrix; and $\phi_{n}(\,\cdot\,;\boldsymbol{\mu},\boldsymbol{\varSigma})$ and $\varPhi_{n}(\,\cdot\,;\boldsymbol{\mu},\boldsymbol{\varSigma})$ are the probability density function and the cumulative distribution function, respectively, of the $n$-dimensional normal distribution with mean vector $\boldsymbol{\mu}$ and variance-covariance matrix $\boldsymbol{\varSigma}$.
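The closed skew-normal density (12) can be evaluated numerically along the same lines. The sketch below is only an illustration: the function name `csn_pdf` and the test values are assumptions of this example, and the multivariate normal distribution function is computed by SciPy's numerical routine, so for $m > 1$ the result is approximate.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm


def csn_pdf(y, mu, Sigma, Dc, nu, Delta):
    """Closed skew-normal density (12), with Dc an n x m matrix as in the text."""
    y, mu, nu = (np.atleast_1d(np.asarray(v, dtype=float)) for v in (y, mu, nu))
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))   # n x n
    Dc = np.atleast_2d(np.asarray(Dc, dtype=float))         # n x m
    Delta = np.atleast_2d(np.asarray(Delta, dtype=float))   # m x m
    # Normalising constant: Phi_m(0; nu, Delta + Dc^T Sigma Dc)
    const = multivariate_normal(mean=nu, cov=Delta + Dc.T @ Sigma @ Dc).cdf(np.zeros(len(nu)))
    gauss = multivariate_normal(mean=mu, cov=Sigma).pdf(y)              # phi_n(y; mu, Sigma)
    skew = multivariate_normal(mean=nu, cov=Delta).cdf(Dc.T @ (y - mu)) # Phi_m(Dc^T (y - mu); nu, Delta)
    return gauss * skew / const


# CSN_{1,1}(0, 1, a, 0, 1) should agree with the scalar skew-normal density 2 phi(y) Phi(a y).
a, y0 = 3.0, 0.4
csn_value = csn_pdf(y0, 0.0, 1.0, a, 0.0, 1.0)
sn_value = 2.0 * norm.pdf(y0) * norm.cdf(a * y0)
print(csn_value, sn_value)
```

The comparison at the end reflects the special case used later in this appendix: for $n = m = 1$, $\boldsymbol{\nu} = 0$ and $\boldsymbol{\varDelta} = 1$, the normalising constant is $1/2$ and (12) reduces to the scalar form of (11).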

Although, as already noted, the finite-dimensional marginal distributions of the multivariate spatial process $(Y_{1}(\mathbf{x}),\ldots,Y_{m}(\mathbf{x}))^{T}$, for $\mathbf{x} \in \mathbb{R}^{2}$, are not skew-normal (in the sense of [2]), it is possible to show that they are closed skew-normal according to the definition of [13]. This implies that, for any given $i = 1,\ldots,m$, all the finite-dimensional marginal distributions of the univariate spatial process $Y_{i}(\mathbf{x})$ are closed skew-normal. To see this (see also [24]), consider $n$ spatial locations $\mathbf{x}_{1},\ldots,\mathbf{x}_{n}$ and the corresponding $n$-dimensional random vector $\mathbf{Y} = (Y_{i}(\mathbf{x}_{1}),\ldots,Y_{i}(\mathbf{x}_{n}))^{T}$. Recalling that for any given $\mathbf{x} \in \mathbb{R}^{2}$ we can write $Y_{i}(\mathbf{x}) = \beta_{i} + Z_{i}(\mathbf{x}) + \omega_{i}S_{i}(\mathbf{x})$, the vector $\mathbf{Y}$ can be written as $\mathbf{Y} = \beta_{i}\mathbf{1}_{n} + \mathbf{Z} + \mathbf{D}_{\omega}\mathbf{S} = \mathbf{W} + \mathbf{V}$, where $\mathbf{W} = \beta_{i}\mathbf{1}_{n} + \mathbf{Z}$, $\mathbf{V} = \mathbf{D}_{\omega}\mathbf{S}$, $\mathbf{Z} = (Z_{i}(\mathbf{x}_{1}),\ldots,Z_{i}(\mathbf{x}_{n}))^{T}$, $\mathbf{S} = (S_{i}(\mathbf{x}_{1}),\ldots,S_{i}(\mathbf{x}_{n}))^{T}$ and $\mathbf{D}_{\omega}$ is the $n \times n$ diagonal matrix with $\omega_{i}$ on the diagonal. Now, since the $S_{i}(\mathbf{x})$, for $\mathbf{x} \in \mathbb{R}^{2}$, are independently and identically distributed as $\mbox{CSN}_{1,1}(0,1,\alpha_{i},0,1)$, by Theorem 3 of [14] we have that $\mathbf{S} \sim \mbox{CSN}_{n,n}(0,\mathbf{I}_{n},\mathbf{D}_{\alpha},0,\mathbf{I}_{n})$, where $\mathbf{D}_{\alpha}$ is the $n \times n$ diagonal matrix with $\alpha_{i}$ on the diagonal. On the other hand, since $\mathbf{Z}$ follows a multivariate normal distribution with mean 0 and covariance matrix $\boldsymbol{\varSigma}_{Z}$ with entries given by $\mathrm{Cov}\big[Z_{i}(\mathbf{x}),Z_{i}(\mathbf{x} + \mathbf{h})\big] = \varsigma_{i}^{2}\rho(\mathbf{h})$, we also have that $\mathbf{Z} \sim \mbox{CSN}_{n,1}(0,\boldsymbol{\varSigma}_{Z},0,0,1)$.

Moreover, since $\mathbf{W}$ is distributed as a multivariate normal with mean $\beta_{i}\mathbf{1}_{n}$ and covariance matrix $\boldsymbol{\varSigma}_{Z}$, we can write that $\mathbf{W} \sim \mbox{CSN}_{n,1}(\beta_{i}\mathbf{1}_{n},\boldsymbol{\varSigma}_{Z},0,0,1)$, and using Theorem of [14] we can also write that $\mathbf{V} \sim \mbox{CSN}_{n,n}(0,\mathbf{D}_{\omega^{2}},\mathbf{D}_{\alpha/\omega},0,\mathbf{I}_{n})$, where $\mathbf{D}_{\omega^{2}}$ is the $n \times n$ diagonal matrix with $\omega_{i}^{2}$ on the diagonal, and $\mathbf{D}_{\alpha/\omega}$ is the $n \times n$ diagonal matrix with $\alpha_{i}/\omega_{i}$ on the diagonal. Thus, considering that $\mathbf{Y} = \mathbf{W} + \mathbf{V}$, we can conclude, using Theorem 4 of [14], that $\mathbf{Y} \sim \mbox{CSN}_{n,n+1}(\beta_{i}\mathbf{1}_{n},\boldsymbol{\varSigma}_{Z} + \omega_{i}^{2}\mathbf{I}_{n},\mathbf{D}^{\ast},0,\boldsymbol{\varDelta}^{\ast})$, for some matrices $\mathbf{D}^{\ast}$ and $\boldsymbol{\varDelta}^{\ast}$.
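The decomposition $\mathbf{Y} = \beta_{i}\mathbf{1}_{n} + \mathbf{Z} + \mathbf{D}_{\omega}\mathbf{S}$ also suggests a direct way to simulate the vector $\mathbf{Y}$ at $n$ locations. The Python sketch below is a rough illustration under stated assumptions, not the authors' simulation code: the exponential correlation function $\rho(\mathbf{h}) = \exp(-\Vert\mathbf{h}\Vert/\texttt{range\_par})$, the function name `simulate_y`, the argument names and all numerical values are hypothetical. It uses only the covariance $\varsigma_{i}^{2}\rho(\mathbf{h})$ of $Z_{i}$ stated above, and draws each $S_{i}(\mathbf{x})$ through the standard stochastic representation $\delta\vert U_{0}\vert + (1-\delta^{2})^{1/2}U_{1}$ of a scalar skew-normal variable, with $\delta = \alpha_{i}/(1+\alpha_{i}^{2})^{1/2}$ and $U_{0}, U_{1}$ independent $N(0,1)$.

```python
import numpy as np


def simulate_y(coords, beta, varsigma2, omega, alpha, range_par, rng):
    """Draw Y = beta * 1_n + Z + omega * S at the given locations.

    Assumption (not from the chapter): rho(h) = exp(-||h|| / range_par).
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise distances between locations and covariance matrix Sigma_Z of Z
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    Sigma_Z = varsigma2 * np.exp(-dist / range_par)
    Z = rng.multivariate_normal(np.zeros(n), Sigma_Z)
    # i.i.d. skew-normal S via delta * |U0| + sqrt(1 - delta^2) * U1
    delta = alpha / np.sqrt(1.0 + alpha**2)
    U0, U1 = rng.standard_normal(n), rng.standard_normal(n)
    S = delta * np.abs(U0) + np.sqrt(1.0 - delta**2) * U1
    return beta + Z + omega * S


rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(200, 2))   # hypothetical sampling sites
Y = simulate_y(coords, beta=2.0, varsigma2=1.0, omega=1.5, alpha=5.0,
               range_par=1.0, rng=rng)
print(Y.shape, Y.mean())
```

Because the skew-normal component $S_{i}(\mathbf{x})$ has mean $\delta\sqrt{2/\pi}$ rather than zero, the simulated values are centred at $\beta_{i} + \omega_{i}\delta\sqrt{2/\pi}$ and not at $\beta_{i}$.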

About this chapter

Cite this chapter

Bagnato, L., Minozzo, M. (2014). A Latent Variable Approach to Modelling Multivariate Geostatistical Skew-Normal Data. In: Carpita, M., Brentari, E., Qannari, E. (eds) Advances in Latent Variables. Studies in Theoretical and Applied Statistics. Springer, Cham. https://doi.org/10.1007/10104_2014_14

DOI: https://doi.org/10.1007/10104_2014_14
Published: 28 October 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02966-5
Online ISBN: 978-3-319-02967-2
eBook Packages: Mathematics and Statistics; Mathematics and Statistics (R0)
