Abstract
This paper presents an overview of results for the geostatistical analysis of collocated multivariate data sets, whose variables form a composition, where the components represent the relative importance of the parts forming a whole. Such data sets occur most often in mining, hydrogeochemistry and soil science, but the results gathered here are relevant for any regionalised compositional data set. The paper covers the basic definitions, the analysis of the spatial codependence between components, mapping methods of cokriging and cosimulation honoring compositional constraints, the role of pre- and post-transformations such as log-ratios or multivariate normal score transforms, and block-support upscaling. The main result is that multivariate geostatistical techniques can and should be performed on log-ratio scores, in which case the system data-variograms-cokriging/cosimulation is intrinsically consistent, delivering the same results regardless of which log-ratio transformation is used to represent the data. Proofs of all statements are included in an appendix.
References
Aitchison J (1982) The statistical analysis of compositional data (with discussion). J R Stat Soc Ser B (Stat Methodol) 44:139–177
Aitchison J (1986) The statistical analysis of compositional data. Monographs on statistics and applied probability. Chapman & Hall Ltd., London (Reprinted in 2003 with additional material by The Blackburn Press)
Aitchison J, Barceló-Vidal C, Martín-Fernández JA, Pawlowsky-Glahn V (2000) Logratio analysis and compositional distance. Math Geol 32(3):271–275
Angerer T, Hagemann S (2010) The BIF-hosted high-grade iron ore deposits in the Archean Koolyanobbing Greenstone Belt, Western Australia: structural control on synorogenic- and weathering-related magnetite-, hematite- and goethite-rich iron ore. Econ Geol 105(3):917–945
Barceló-Vidal C (2003) When a data set can be considered compositional? In: Thió-Henestrosa S, Martin-Fernández JA (eds) Proceedings of CoDaWork’03, The 1st Compositional Data Analysis Workshop. Universitat de Girona
Barceló-Vidal C, Martín-Fernández JA (2016) The mathematics of compositional analysis. Austrian J Stat 45(4):57–71
Barnett RM, Manchuk JG, Deutsch CV (2014) Projection pursuit multivariate transform. Math Geosci 46(2):337–360
Bivand RS, Pebesma E, Gomez-Rubio V (2013) Applied spatial data analysis with R. Springer, New York
Chayes F (1960) On correlation between variables of constant sum. J Geophys Res 65(12):4185–4193
Chilès JP, Delfiner P (1999) Geostatistics. Wiley, New York
Cressie N (1991) Statistics for spatial data. Wiley, New York
Egozcue JJ, Pawlowsky-Glahn V, Mateu-Figueras G, Barceló-Vidal C (2003) Isometric logratio transformations for compositional data analysis. Math Geol 35(3):279–300
Filzmoser P, Hron K (2008) Outlier detection for compositional data using robust methods. Math Geosci 40(3):233–248
Geovariances (2017) Isatis geostatistical software. Avon, France
Griffin AC (1981) Structure and iron ore deposition in the Archaean Koolyanobbing Greenstone belt, Western Australia. Geol Soc Aust Spec Publ 7:429–438
Journel AG, Huijbregts CJ (1978) Mining geostatistics. Academic Press, London
Lark RM, Bishop TFA (2007) Cokriging particle size fractions of the soil. Eur J Soil Sci 58(3):763–774
Leuangthong O, Deutsch CV (2003) Stepwise conditional transformation for simulation of multiple variables. Math Geol 35(2):155–173
Mateu-Figueras G, Pawlowsky-Glahn V, Egozcue JJ (2011) The principle of working on coordinates. In: Pawlowsky-Glahn V, Buccianti A (eds) Compositional data analysis: theory and applications. Wiley, New York, pp 29–42
Mateu-Figueras G, Pawlowsky-Glahn V, Egozcue JJ (2013) The normal distribution in some constrained sample spaces. Stat Oper Res Trans 37(1):29–56
Matheron G (1963) Principles of geostatistics. Econ Geol 58:1246–1266
Matheron G (1965) Les variables régionalisées et leur estimation-une application de la théorie des fonctions aléatoires aux sciences de la nature. Masson et Cie, Paris
Matheron G (1971) The theory of regionalized variables and its applications. Technical Report C-5, École Nationale Supérieure des Mines de Paris, Centre de Geostatistique et de Morphologie Mathematique, Fontainebleau
Molayemat H, Torab FM, Pawlowsky-Glahn V, Hossein Morshedy A, Egozcue JJ (2018) The impact of the compositional nature of data on coal reserve evaluation, a case study in Parvadeh IV coal deposit, Central Iran. Int J Coal Geol 188:94–111. https://doi.org/10.1016/j.coal.2018.02.003
Morales Boezio MN (2010) Estudo das metodologias alternativas da geoestatística multivariada aplicadas a estimativa de teores de depósitos de ferro. Ph.D. thesis, Universidade Federal do Rio Grande do Sul
Morales Boezio MN, Costa JF, Koppe JC (2012) Cokrigagem de razões logarítmicas aditivas (alr) na estimativa de teores em depósitos de ferro (Cokriging of additive log-ratios (alr) for grade estimation in iron ore deposits). Rev Esc Minas 65:401–412
Mueller UA, Grunsky EC (2016) Multivariate spatial analysis of lake sediment geochemical data; Melville Peninsula, Nunavut, Canada. Appl Geochem 75:247–262. https://doi.org/10.1016/j.apgeochem.2016.02.007
Mueller U, Tolosana-Delgado R, van den Boogaart KG (2014) Simulation of compositional data: a nickel-laterite case study. In: Dimitrakopoulos R (ed) Advances in orebody modelling and strategic mine planning. AusIMM, Melbourne
Myers DE (1982) Matrix formulation of co-kriging. Math Geol 14(3):249–257
Pawlowsky V (1984) On spurious spatial covariance between variables of constant sum. Sci Terre Sér Inform 21:107–113
Pawlowsky V (1986) Räumliche Strukturanalyse und Schätzung ortsabhängiger Kompositionen mit Anwendungsbeispielen aus der Geologie. Ph.D. thesis, Freie Universität Berlin
Pawlowsky V (1989) Cokriging of regionalized compositions. Math Geol 21(5):513–521
Pawlowsky-Glahn V (2003) Statistical modelling on coordinates. In: Thió-Henestrosa S, Martin-Fernández JA (eds) Proceedings of CoDaWork’03, The 1st Compositional Data Analysis Workshop. Universitat de Girona
Pawlowsky-Glahn V, Burger H (1992) Spatial structure analysis of regionalized compositions. Math Geol 24(6):675–691
Pawlowsky-Glahn V, Egozcue JJ (2001) Geometric approach to statistical analysis on the simplex. Stoch Environ Res Risk Assess 15(5):384–398
Pawlowsky-Glahn V, Egozcue JJ (2002) BLU estimators and compositional data. Math Geol 34(3):259–274
Pawlowsky-Glahn V, Egozcue JJ (2016) Spatial analysis of compositional data: a historical review. J Geochem Explor 164:28–32. https://doi.org/10.1016/j.gexplo.2015.12.010
Pawlowsky-Glahn V, Egozcue JJ, Tolosana-Delgado R (2015) Modeling and analysis of compositional data. Wiley, Chichester
Pawlowsky-Glahn V, Olea RA (2004) Geostatistical analysis of compositional data. Studies in mathematical geology 7. Oxford University Press, Oxford
Pawlowsky-Glahn V, Olea RA, Davis JC (1995) Estimation of regionalized compositions: a comparison of three methods. Math Geol 27(1):105–127
Rossi ME, Deutsch CV (2014) Mineral resource estimation. Springer, New York
Sun XL, Wu YJ, Wang HL, Zhao YG, Zhang GL (2014) Mapping soil particle size fractions using compositional kriging, cokriging and additive log-ratio cokriging in two case studies. Math Geosci 46(4):429–443
Tjelmeland H, Lund KV (2003) Bayesian modelling of spatial compositional data. J Appl Stat 30(1):87–100
Tolosana-Delgado R (2006) Geostatistics for constrained variables: positive data, compositions and probabilities. Application to environmental hazard monitoring. Ph.D. thesis, Universitat de Girona
Tolosana-Delgado R, Egozcue JJ, Pawlowsky-Glahn V (2008) Cokriging of compositions: log ratios and unbiasedness. In: Ortiz JM, Emery X (eds) Geostatistics Chile 2008. Gecamin Ltd., Santiago, pp 299–308
Tolosana-Delgado R, Mueller U, van den Boogaart KG, Ward C (2013) Block cokriging of a whole composition. In: Costa JF, Koppe J, Peroni R (eds) Proceedings of the 36th APCOM international symposium on the applications of computers and operations research in the mineral industry. Fundacao Luiz Englert, Porto Alegre, pp 267–277
Tolosana-Delgado R, Mueller U, van den Boogaart KG, Ward C, Gutzmer J (2015) Improving processing by adaption to conditional geostatistical simulation of block compositions. J South Afr Inst Min Metall 115(1):13–26
Tolosana-Delgado R, Otero N, Pawlowsky-Glahn V (2005) Some basic concepts of compositional geometry. Math Geol 37(7):673–680
Tolosana-Delgado R, van den Boogaart KG (2013) Joint consistent mapping of high-dimensional geochemical surveys. Math Geosci 45(8):983–1004
Tolosana-Delgado R, van den Boogaart KG, Pawlowsky-Glahn V (2011) Geostatistics for compositions. In: Pawlowsky-Glahn V, Buccianti A (eds) Compositional data analysis: theory and applications. Wiley, New York, pp 73–86
van den Boogaart KG, Tolosana-Delgado R, Bren M (2018) Compositions: compositional data analysis package. R package version 1.40-2
van den Boogaart KG, Tolosana-Delgado R (2013) Analysing compositional data with R. Springer, Heidelberg
van den Boogaart KG, Tolosana-Delgado R, Mueller U (2017) An affine equivariant multivariate normal score transform for compositional data. Math Geosci 49(2):231–252
Wackernagel H (2003) Multivariate geostatistics: an introduction with applications. Springer, Berlin
Walvoort DJJ, de Gruijter JJ (2001) Compositional kriging: a spatial interpolation method for compositional data. Math Geol 33(8):951–966
Ward C, Mueller U (2012) Multivariate estimation using log ratios: a worked alternative. In: Abrahamsen P, Hauge R, Kolbjornsen O (eds) Geostatistics Oslo 2012. Springer, Berlin, pp 333–343
Ward C, Mueller U (2013) Compositions, log ratios and bias—from grade control to resource. In: Iron Ore 2013 shifting the paradigm. AUSIMM, Melbourne, pp 313–320
Acknowledgements
This paper was compiled during research visits between Perth and Freiberg within the project “CoDaBlockCoEstimation”, jointly funded by the German Academic Exchange Service (DAAD) and Universities Australia. The authors warmly thank Vera Pawlowsky-Glahn and Juan José Egozcue for their constructive comments on a previous version of this manuscript.
Appendices
Appendix A: Formal Statements and Proofs
In this appendix, proofs for the results in the main text are provided. They are organised into a non-spatial part and a spatial part. A few of these results have been published before, albeit mostly for the \({\text {alr}}\), \({\text {clr}}\) or \({\text {ilr}}\) transformations; they are included here for completeness and generality, as the expressions in this contribution are valid for any full-rank log-ratio transformation. These pre-existing results are cited as appropriate. In what follows, \({\mathbf {z}}\) or \({\mathbf {Z}}\) will denote a composition in the original scale, and \(\varvec{\zeta }\) or \(\varvec{Z}\) the corresponding log-ratio scores.
A.1 Non-spatial Results
Definition 1
(composition as a closed vector) A vector \({\mathbf {z}} \in \mathbb {R}^D\) is called a composition if its \(k^{th}\) component \(z_{k}\) represents the relative importance of part k with respect to the remaining components.
Typically, \(z_{k}\ge 0\) and \(z_{1}+z_{2}+\cdots +z_{D}=\kappa \), with \(\kappa =1\) (for proportions), \(\kappa =100\) (for percentages) or \(\kappa =10^6\) (for ppm). However, the variables under consideration might only represent a subset of all possible variables, in which case the constant sum constraint is not necessarily satisfied. Subsequent treatment of the data then depends on whether or not the resulting non-constant sum is meaningful and less than \(\kappa \). In the former case, a fill-up variable (Eq. 2) can be added to retain that information and fulfill the constraint. On the other hand, if the non-constant sum is meaningless, the data can be reclosed (Eq. 3) without losing any information. Mathematically, this last case gives rise to the definition of compositions as equivalence classes (Barceló-Vidal 2003), the modern, more general definition of composition.
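As an illustration of fill-up versus reclosure, a minimal numpy sketch (the helper name `closure` and the toy numbers are ours, not from the paper; Eqs. 2 and 3 refer to the main text):

```python
import numpy as np

def closure(z, kappa=1.0):
    """Reclose a composition to the constant sum kappa (cf. Eq. 3)."""
    z = np.asarray(z, dtype=float)
    return kappa * z / z.sum(axis=-1, keepdims=True)

# a subcomposition in percent whose total falls short of 100
sub = np.array([55.0, 25.0, 10.0])

# if the deficit is meaningful, add a fill-up variable (cf. Eq. 2)
filled = np.append(sub, 100.0 - sub.sum())

# if the deficit is meaningless, reclose without loss of information
reclosed = closure(sub, kappa=100.0)
```

Note that reclosing an already-closed composition leaves it unchanged, which is the idempotence behind the equivalence-class definition.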
Definition 2
(log-ratio representation) A function \(\psi (\cdot )\) is a full-rank log-ratio representation of the composition \({\mathbf {z}}\) if its image \(\varvec{\zeta }\) satisfies
$$ \varvec{\zeta } = \psi ({\mathbf {z}}) = \varvec{\varPsi }\cdot \ln {\mathbf {z}}, $$
where \(\varvec{\varPsi }\) is a \((D-1) \times D\) matrix of rank \((D-1)\) with \(\varvec{\varPsi } \cdot {\mathbf {1}}_{D}={\mathbf {0}}_{D-1}\) (Barceló-Vidal and Martín-Fernández 2016).
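For concreteness, the matrices of two standard full-rank representations can be written down directly; a numpy sketch under the above definition (the matrix names and the Helmert-type ilr choice are ours):

```python
import numpy as np

D = 4

# alr with the last part as denominator: zeta_k = ln(z_k / z_D)
Psi_alr = np.hstack([np.eye(D - 1), -np.ones((D - 1, 1))])

# an ilr basis built from Helmert-type balances (one standard choice)
Psi_ilr = np.zeros((D - 1, D))
for i in range(1, D):
    Psi_ilr[i - 1, :i] = 1.0 / np.sqrt(i * (i + 1))
    Psi_ilr[i - 1, i] = -i / np.sqrt(i * (i + 1))

# both satisfy the defining conditions: rank D-1 and Psi . 1_D = 0
for Psi in (Psi_alr, Psi_ilr):
    assert np.linalg.matrix_rank(Psi) == D - 1
    assert np.allclose(Psi @ np.ones(D), 0.0)
```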
Lemma 1
(inversion) If \(\psi (\cdot )\) is a full-rank log-ratio transformation, then the corresponding matrix \(\varvec{\varPsi }\) satisfies \(\varvec{\varPsi }^{-}\cdot \varvec{\varPsi }={\mathbf {H}}\), where \({\mathbf {H}}\) is the projection matrix on the orthogonal complement of the vector \({\mathbf {1}}_{D}\) in \(\mathbb {R}^D\) and \(\varvec{\varPsi }^{-}\) is its generalized inverse.
Proof
The singular value decomposition of \(\varvec{\varPsi }\) is given by \(\varvec{\varPsi } = {\mathbf {U}}\cdot {\mathbf {S}} \cdot {\mathbf {V}}^t\), where \({\mathbf {U}}\) is an orthogonal \((D-1) \times (D-1)\) matrix, \({\mathbf {V}}\) is an orthogonal \(D \times D\) matrix with \({\mathbf {V}}^t{\mathbf {V}}={\mathbf {I}}_{D}\) and \({\mathbf {S}}=\left[ \begin{matrix} {\mathbf {D}}_{(D-1)}&{\mathbf {0}}_{(D-1)} \end{matrix}\right] \) is a \((D-1) \times D\) matrix with \({\mathbf {D}}\) an invertible real diagonal matrix and \({\mathbf {0}}_{(D-1)}\) a column vector of zeros. The Moore-Penrose inverse is, therefore, \(\varvec{\varPsi }^{-}={\mathbf {V}}\cdot {\mathbf {S}}^{+}\cdot {\mathbf {U}}^t\) where \({\mathbf {S}}^{+}=[{\mathbf {D}}^{-1} \ {\mathbf {0}}_{(D-1)}]^t\). Then
$$ \varvec{\varPsi }^{-}\cdot \varvec{\varPsi } = {\mathbf {V}}\cdot {\mathbf {S}}^{+}\cdot {\mathbf {U}}^t\cdot {\mathbf {U}}\cdot {\mathbf {S}}\cdot {\mathbf {V}}^t = {\mathbf {V}}\cdot \left( {\mathbf {S}}^{+}\cdot {\mathbf {S}}\right) \cdot {\mathbf {V}}^t. $$
Since \({\mathbf {S}}^{+}\cdot {\mathbf {S}}\) has rank \(D-1\) and \({\mathbf {V}}\) has full rank, \(\varvec{\varPsi }^{-} \cdot \varvec{\varPsi }\) has rank \(D-1\) and its eigenvalues are 1 and 0. Since \((\varvec{\varPsi }^{-} \cdot \varvec{\varPsi })^2=\varvec{\varPsi }^{-} \cdot \varvec{\varPsi }\), and \((\varvec{\varPsi }^{-} \cdot \varvec{\varPsi })^t=\varvec{\varPsi }^{-} \cdot \varvec{\varPsi }\), the matrix \(\varvec{\varPsi }^{-} \cdot \varvec{\varPsi }\) is an orthogonal projection. Moreover from the definition of \(\varvec{\varPsi }\) it follows that \(\varvec{\varPsi }^{-} \cdot \varvec{\varPsi }\cdot {\mathbf {1}}=\varvec{\varPsi }^{-} \cdot {\mathbf {0}}={\mathbf {0}}\). Therefore, if the columns of \({\mathbf {V}}\) are denoted by \( {\mathbf {v}}_{i}, i=1, \dots ,D\), the eigenvector for 0 is given by \({\mathbf {v}}_{D} = \frac{1}{\sqrt{D}}{\mathbf {1}}\), so that \((\varvec{\varPsi }^{-} \cdot \varvec{\varPsi }) = {\mathbf {I}}_{D}-{\mathbf {v}}_{D}{\mathbf {v}}_{D}^t = {\mathbf {I}}_D-\frac{1}{D}{\mathbf {1}}_{D\times D}={\mathbf {H}}\). \(\square \)
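Lemma 1 can be checked numerically; a small sketch (an alr-type matrix stands in for an arbitrary full-rank \(\varvec{\varPsi }\)):

```python
import numpy as np

D = 5
Psi = np.hstack([np.eye(D - 1), -np.ones((D - 1, 1))])  # a full-rank log-ratio matrix
H = np.eye(D) - np.ones((D, D)) / D                     # projector onto the complement of 1_D

# Moore-Penrose generalised inverse, as in the proof
Psi_pinv = np.linalg.pinv(Psi)
assert np.allclose(Psi_pinv @ Psi, H)
```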
Proposition 1
(inverse log-ratio representation) A full-rank log-ratio representation \(\psi (\cdot )\) is one-to-one, and its inverse is
$$ {\mathbf {z}} = \psi ^{-1}(\varvec{\zeta }) = {\mathcal {C}}\left[ \exp \left( \varvec{\varPsi }^{-}\cdot \varvec{\zeta }\right) \right] . $$
Proof
From the previous lemma it follows that
$$ \psi ^{-1}(\psi ({\mathbf {z}})) = {\mathcal {C}}\left[ \exp \left( \varvec{\varPsi }^{-}\cdot \varvec{\varPsi }\cdot \ln {\mathbf {z}}\right) \right] = {\mathcal {C}}\left[ \exp \left( {\mathbf {H}}\cdot \ln {\mathbf {z}}\right) \right] = {\mathcal {C}}[{\mathbf {z}}]. $$
It remains to be shown that \(\psi (\cdot )\) is one-to-one when restricted to the orthogonal complement of \({\mathbf {1}}_{D}\), but this is a direct consequence of the definition of \(\psi (\cdot )\). \(\square \)
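The forward map and the inverse of Proposition 1 compose to the identity on closed compositions; a numpy sketch (helper names are ours):

```python
import numpy as np

def closure(z):
    z = np.asarray(z, dtype=float)
    return z / z.sum(axis=-1, keepdims=True)

D = 4
Psi = np.hstack([np.eye(D - 1), -np.ones((D - 1, 1))])   # alr-type matrix

z = closure([0.4, 0.3, 0.2, 0.1])
zeta = Psi @ np.log(z)                                   # forward map psi(z)
z_back = closure(np.exp(np.linalg.pinv(Psi) @ zeta))     # inverse of Proposition 1
assert np.allclose(z_back, z)
```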
Proposition 2
(change of log-ratio representation) Let \({\mathbf {z}}\) be a composition, and \(\psi _1(\cdot )\) and \(\psi _2(\cdot )\) be two full-rank log-ratio transformations characterized by the matrices \(\varvec{\varPsi }_1\) and \(\varvec{\varPsi }_2\) respectively. Then, its two log-ratio representations \(\varvec{\zeta }_1=\psi _1({\mathbf {z}})\) and \(\varvec{\zeta }_2=\psi _2({\mathbf {z}})\) are related through the linear relationship
$$ \varvec{\zeta }_2 = {\mathbf {A}}_{12}\cdot \varvec{\zeta }_1, $$
where the matrix \({\mathbf {A}}_{12}=\varvec{\varPsi }_2\cdot \varvec{\varPsi }_1^-\) is square and invertible.
Proof
From the preceding two propositions it follows that \(\varvec{\zeta }_2 = \psi _2({\mathbf {z}}) = \varvec{\varPsi }_2 \cdot \ln {\mathbf {z}}\) and \({\mathbf {z}} = {\mathcal {C}}[\exp ( \varvec{\varPsi }_1^{-} \cdot \varvec{\zeta }_1 )]\). Substituting the second expression into the first, one has
$$ \varvec{\zeta }_2 = \varvec{\varPsi }_2\cdot \ln {\mathcal {C}}\left[ \exp \left( \varvec{\varPsi }_1^{-}\cdot \varvec{\zeta }_1\right) \right] = \varvec{\varPsi }_2\cdot \left( \varvec{\varPsi }_1^{-}\cdot \varvec{\zeta }_1 - \alpha \,{\mathbf {1}}\right) = {\mathbf {A}}_{12}\cdot \varvec{\zeta }_1 - \alpha \,\varvec{\varPsi }_2\cdot {\mathbf {1}}, $$
where \(\alpha =\ln ({\mathbf {1}}^t\cdot \exp (\varvec{\varPsi }_1^{-} \cdot \varvec{\zeta }_1))\). The last term vanishes because \(\varvec{\varPsi }_2\cdot {\mathbf {1}}={\mathbf {0}}\), which delivers the final expression as sought. \(\square \)
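The change-of-representation matrix \({\mathbf {A}}_{12}\) can be formed and checked directly; a numpy sketch (alr and a Helmert-type ilr as the two representations, names ours):

```python
import numpy as np

D = 4
Psi1 = np.hstack([np.eye(D - 1), -np.ones((D - 1, 1))])   # alr
Psi2 = np.zeros((D - 1, D))                                # Helmert-type ilr
for i in range(1, D):
    Psi2[i - 1, :i] = 1.0 / np.sqrt(i * (i + 1))
    Psi2[i - 1, i] = -i / np.sqrt(i * (i + 1))

A12 = Psi2 @ np.linalg.pinv(Psi1)                          # change-of-representation matrix
z = np.array([0.4, 0.3, 0.2, 0.1])
assert np.allclose(A12 @ (Psi1 @ np.log(z)), Psi2 @ np.log(z))
assert np.linalg.matrix_rank(A12) == D - 1                 # square and invertible
```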
Proposition 3
(log-ratio representation of the mean) Let \({\mathbf {Z}}=[z_{kn}]\), \(k=1,2,\ldots , D\), \(n=1,2, \ldots , N\), be a compositional data set with N observations and of D parts, and \(\psi (\cdot )\) be a full-rank log-ratio transformation. Then \(\hat{\text {E}}\left[ \psi ({\mathbf {Z}})\right] =\psi (\hat{ {\varvec{\mu }^{g}}})\), the log-ratio representation of the closed geometric mean (Eq. 13).
Proof
The empirical closed geometric center is \(\hat{{\mathbf {m}}} = {\mathcal {C}}[ \exp ( \ln ({\mathbf {Z}}) \cdot \varvec{1}_N/N ) ]\). The log-ratio mean is given by \(\hat{\text {E}}\left[ \psi ({\mathbf {Z}})\right] =(\varvec{\varPsi } \cdot \ln {\mathbf {Z}})\cdot \varvec{1}_N/N\). Substituting this expression into the definition of the inverse log-ratio representation results in
$$ \psi ^{-1}\left( \hat{\text {E}}\left[ \psi ({\mathbf {Z}})\right] \right) = {\mathcal {C}}\left[ \exp \left( \varvec{\varPsi }^{-}\cdot \varvec{\varPsi }\cdot \ln ({\mathbf {Z}})\cdot \varvec{1}_N/N\right) \right] = {\mathcal {C}}\left[ \exp \left( {\mathbf {H}}\cdot \ln ({\mathbf {Z}})\cdot \varvec{1}_N/N\right) \right] = \hat{{\mathbf {m}}}. $$
\(\square \)
This proposition also proves Eq. (15): Because the calculation of \(\hat{ {\varvec{\mu }^{g}}}\) does not involve any log-ratio representation, all log-ratio representations are equivalent. The idea of deriving statistics for compositional data from transformed scores is an application of the principle of working in coordinates (Mateu-Figueras et al. 2011).
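Proposition 3 is easy to verify on synthetic data; a numpy sketch (the data and helper are ours):

```python
import numpy as np

def closure(z):
    z = np.asarray(z, dtype=float)
    return z / z.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
Z = closure(rng.gamma(2.0, size=(50, 4)))            # N = 50 closed compositions, D = 4
Psi = np.hstack([np.eye(3), -np.ones((3, 1))])       # an alr-type matrix

m_hat = closure(np.exp(np.log(Z).mean(axis=0)))      # closed geometric center
# the mean of the scores equals the score of the center (Proposition 3)
assert np.allclose((np.log(Z) @ Psi.T).mean(axis=0), Psi @ np.log(m_hat))
```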
Proposition 4
(log-ratio representations of the covariance) Let \({\mathbf {Z}}=[z_{kn}]\), \(k=1,2,\ldots , D\), \(n=1,2, \ldots ,N\), be a compositional data set with N observations and D parts, and \(\psi (\cdot )\) be a full-rank log-ratio transformation. Then the covariance matrix of the log-ratio representation can be obtained from the empirical variation matrix \(\hat{{\mathbf {T}}}\) as \(\hat{\varvec{\varSigma }}^\psi = -\frac{1}{2} \varvec{\varPsi }\cdot \hat{{\mathbf {T}}}\cdot \varvec{\varPsi }^t\).
Proof
From Aitchison (1986) it is known that the clr covariance \(\hat{\varvec{\varSigma }}^c\) is related to the empirical variation matrix by \(\hat{\varvec{\varSigma }}^c = -\frac{1}{2}{\mathbf {H}} \cdot \hat{{\mathbf {T}}}\cdot {\mathbf {H}}\). Moreover, \(\varvec{\varPsi } \cdot {\mathbf {H}}=\varvec{\varPsi }\), a consequence of the definition of the matrix \({\mathbf {H}}\) (Eq. 9), where
$$ \varvec{\varPsi }\cdot {\mathbf {H}} = \varvec{\varPsi }\cdot \left( {\mathbf {I}}_D-\frac{1}{D}{\mathbf {1}}_{D\times D}\right) = \varvec{\varPsi }-\frac{1}{D}\left( \varvec{\varPsi }\cdot {\mathbf {1}}_{D}\right) \cdot {\mathbf {1}}_{D}^t = \varvec{\varPsi }, $$
because the rows of \(\varvec{\varPsi }\) sum to zero. Therefore, it remains to be shown that \(\hat{\varvec{\varSigma }}^\psi = \varvec{\varPsi }\cdot \hat{\varvec{\varSigma }}^c\cdot \varvec{\varPsi }^t\). The (maximum likelihood) estimators of these two covariance matrices are
$$ \hat{\varvec{\varSigma }}^c = \frac{1}{N}\sum _{n=1}^{N} {\mathbf {H}}\cdot \left( \ln {\mathbf {z}}_n - \ln \hat{{\mathbf {m}}}\right) \cdot \left( \ln {\mathbf {z}}_n - \ln \hat{{\mathbf {m}}}\right) ^t\cdot {\mathbf {H}}, \qquad \hat{\varvec{\varSigma }}^\psi = \frac{1}{N}\sum _{n=1}^{N} \varvec{\varPsi }\cdot \left( \ln {\mathbf {z}}_n - \ln \hat{{\mathbf {m}}}\right) \cdot \left( \ln {\mathbf {z}}_n - \ln \hat{{\mathbf {m}}}\right) ^t\cdot \varvec{\varPsi }^t. $$
Since \({\mathbf {H}}={\mathbf {H}}^t\), so that \({\mathbf {H}}\cdot \varvec{\varPsi }^t =\varvec{\varPsi }^t\), it follows that
$$ \varvec{\varPsi }\cdot \hat{\varvec{\varSigma }}^c\cdot \varvec{\varPsi }^t = \frac{1}{N}\sum _{n=1}^{N} \left( \varvec{\varPsi }\cdot {\mathbf {H}}\right) \cdot \left( \ln {\mathbf {z}}_n - \ln \hat{{\mathbf {m}}}\right) \cdot \left( \ln {\mathbf {z}}_n - \ln \hat{{\mathbf {m}}}\right) ^t\cdot \left( {\mathbf {H}}\cdot \varvec{\varPsi }^t\right) = \hat{\varvec{\varSigma }}^\psi . $$
Therefore \(\hat{\varvec{\varSigma }}^\psi = \varvec{\varPsi } \cdot \hat{\varvec{\varSigma }}^c \cdot \varvec{\varPsi }^t= -\frac{1}{2} \varvec{\varPsi } \cdot {\mathbf {H}} \cdot \hat{{\mathbf {T}}}\cdot {\mathbf {H}}\cdot \varvec{\varPsi }^t= -\frac{1}{2} \varvec{\varPsi } \cdot \hat{{\mathbf {T}}}\cdot \varvec{\varPsi }^t\). \(\square \)
It is straightforward to show that the same properties hold for unbiased estimators (with denominator \(N-1\)).
The preceding two propositions show that the empirical log-ratio mean vector and covariance matrix can be obtained directly from the empirical closed geometric center and the variation matrix. Equivalent relationships exist also between the theoretical counterparts of these statistics.
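Proposition 4 can be confirmed numerically: the covariance recast from the variation matrix agrees exactly with the covariance computed directly on the scores. A numpy sketch (synthetic data, names ours):

```python
import numpy as np

rng = np.random.default_rng(2)
N, D = 200, 4
Z = rng.gamma(2.0, size=(N, D))
Z /= Z.sum(axis=1, keepdims=True)
L = np.log(Z)

# empirical variation matrix: T[i, j] = var(ln(z_i / z_j)), ML estimator
T = np.var(L[:, :, None] - L[:, None, :], axis=0)

Psi = np.hstack([np.eye(D - 1), -np.ones((D - 1, 1))])
Sigma_psi = -0.5 * Psi @ T @ Psi.T                        # Proposition 4

# direct ML covariance of the log-ratio scores agrees exactly
Sigma_emp = np.cov(L @ Psi.T, rowvar=False, bias=True)
assert np.allclose(Sigma_psi, Sigma_emp)
```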
Corollary 1
If \(\psi (\cdot )\) is a full rank log-ratio transformation, then \(\varvec{\varPsi }^{-} \cdot \hat{\varvec{\varSigma }}^\varPsi \cdot \varvec{\varPsi }^{-t} = \hat{\varvec{\varSigma }}^c\).
Proof
From Proposition 4 it follows that \(\varvec{\varPsi }^{-} \cdot \hat{\varvec{\varSigma }}^\varPsi \cdot \varvec{\varPsi }^{-t}= \varvec{\varPsi }^{-} \cdot \varvec{\varPsi } \cdot \hat{\varvec{\varSigma }}^c \cdot \varvec{\varPsi }^t \cdot \varvec{\varPsi }^{-t}= {\mathbf {H}} \cdot \hat{\varvec{\varSigma }}^c\cdot {\mathbf {H}}^{t}=\hat{\varvec{\varSigma }}^c\). \(\square \)
Corollary 2
If \(\psi _{1}(\cdot )\) and \(\psi _{2}(\cdot )\) are full rank log-ratio transformations, then \(\hat{\varvec{\varSigma }}^{\varPsi _2}={\mathbf {A}}_{12}\cdot \hat{\varvec{\varSigma }}^{\varPsi _1} \cdot {\mathbf {A}}_{12}^t\), where \({\mathbf {A}}_{12}=\varvec{\varPsi }_{2} \cdot \varvec{\varPsi }_{1}^{-}\).
Proof
From \(\hat{\varvec{\varSigma }}^{\psi _2} = \varvec{\varPsi }_{2} \cdot \hat{\varvec{\varSigma }}^c \cdot \varvec{\varPsi }_{2}^t\) and Corollary 1 it follows that \(\hat{\varvec{\varSigma }}^{\psi _2}=\varvec{\varPsi }_{2} \cdot \varvec{\varPsi }_{1}^{-} \cdot \hat{\varvec{\varSigma }}^{\psi _1} \cdot \varvec{\varPsi }_{1}^{-t}\cdot \varvec{\varPsi }_{2}^t={\mathbf {A}}_{12} \cdot \hat{\varvec{\varSigma }}^{\psi _1}\cdot {\mathbf {A}}_{12}^t\). \(\square \)
Corollary 3
If \(\psi (\cdot )\) is a full rank log-ratio transformation, then \(({\hat{\varvec{\varSigma }}^c})^{-}=\varvec{\varPsi }^{t} \cdot (\hat{{\varvec{\varSigma }}}^\psi )^{-1} \cdot \varvec{\varPsi }\) is a generalised inverse of \(\hat{{\varvec{\varSigma }}}^c\).
Proof
Firstly, \({\hat{\varvec{\varSigma }}^\psi }\) has full rank and so is invertible, thus
$$ \hat{\varvec{\varSigma }}^c\cdot ({\hat{\varvec{\varSigma }}^c})^{-} = \varvec{\varPsi }^{-}\cdot \hat{\varvec{\varSigma }}^\psi \cdot \varvec{\varPsi }^{-t}\cdot \varvec{\varPsi }^{t}\cdot (\hat{\varvec{\varSigma }}^\psi )^{-1}\cdot \varvec{\varPsi } = \varvec{\varPsi }^{-}\cdot \varvec{\varPsi } = {\mathbf {H}}, $$
since \((\varvec{\varPsi } \cdot \varvec{\varPsi }^{-})= {\mathbf {I}}_{(D-1)}\), so that \(\hat{\varvec{\varSigma }}^c\cdot {(\hat{\varvec{\varSigma }}^c)}^{-}\) is symmetric. Secondly, \({\hat{\varvec{\varSigma }}^c}\cdot ({\hat{\varvec{\varSigma }}^c)}^{-} \cdot {\hat{\varvec{\varSigma }}^c}={\mathbf {H}} \cdot {\hat{\varvec{\varSigma }}}^c=\hat{\varvec{\varSigma }}^c\) and \(({\hat{\varvec{\varSigma }}^c})^{-} \cdot \hat{\varvec{\varSigma }}^c \cdot ({\hat{\varvec{\varSigma }}^c})^{-} =({\hat{\varvec{\varSigma }}^c})^{-} \cdot {\mathbf {H}}=\varvec{\varPsi }^{t} \cdot ({\hat{\varvec{\varSigma }}^\psi })^{-1} \cdot \varvec{\varPsi }\cdot {\mathbf {H}} =({\hat{\varvec{\varSigma }}^c})^{-}\). Similarly, \(({\hat{\varvec{\varSigma }}^c})^{-} \cdot \hat{\varvec{\varSigma }}^c={\mathbf {H}}\). Therefore, \(\varvec{\varPsi }^{t} \cdot ({\hat{\varvec{\varSigma }}^\psi })^{-1} \cdot \varvec{\varPsi }\) satisfies all conditions of a generalised inverse. \(\square \)
Proposition 5
(invariance of the Mahalanobis distance) Let \({\mathbf {Z}}\) be a random composition, with variation matrix \({\mathbf {T}}\). The Aitchison–Mahalanobis distance between any two of its realisations \({\mathbf {z}}_1\) and \({\mathbf {z}}_2\),
$$ d_M^2({\mathbf {z}}_1,{\mathbf {z}}_2) = \left( {\text {clr}}({\mathbf {z}}_1)-{\text {clr}}({\mathbf {z}}_2)\right) ^t\cdot (\varvec{\varSigma }^c)^{-}\cdot \left( {\text {clr}}({\mathbf {z}}_1)-{\text {clr}}({\mathbf {z}}_2)\right) , \qquad \varvec{\varSigma }^c=-\frac{1}{2}{\mathbf {H}}\cdot {\mathbf {T}}\cdot {\mathbf {H}}, $$
is invariant under the choice of full-rank log-ratio representation \(\psi (\cdot )\).
Proof
To show this proposition, it suffices to observe that from Corollary 3 and the proof of Proposition 4 one obtains \({\mathbf {H}}\cdot {(\varvec{\varSigma }^c)}^{-}\cdot {\mathbf {H}}={\mathbf {H}}\cdot \varvec{\varPsi }^{t} \cdot {({\varvec{\varSigma }}^\psi )}^{-1} \cdot \varvec{\varPsi }\cdot {\mathbf {H}}=\varvec{\varPsi }^{t} \cdot {(\varvec{\varSigma }^\psi )}^{-1} \cdot \varvec{\varPsi }\), so that
$$ d_M^2({\mathbf {z}}_1,{\mathbf {z}}_2) = \ln ^t\left( \frac{{\mathbf {z}}_1}{{\mathbf {z}}_2}\right) \cdot \varvec{\varPsi }^{t}\cdot (\varvec{\varSigma }^\psi )^{-1}\cdot \varvec{\varPsi }\cdot \ln \left( \frac{{\mathbf {z}}_1}{{\mathbf {z}}_2}\right) = \left( \varvec{\zeta }_1-\varvec{\zeta }_2\right) ^t\cdot (\varvec{\varSigma }^\psi )^{-1}\cdot \left( \varvec{\zeta }_1-\varvec{\zeta }_2\right) , $$
an expression which, given that \(\varvec{\varPsi }^{-} \cdot \varvec{\varPsi }={\mathbf {H}}\), does not depend on the log-ratio representation at all. \(\square \)
Filzmoser and Hron (2008) proved a more restricted version of this proposition, valid for the set of \({\text {clr}}\), \({\text {alr}}\) and \({\text {ilr}}\) log-ratio transformations. Proposition 6 is a direct consequence of the invariance property of the Mahalanobis distance.
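The invariance of Proposition 5 can be checked numerically with two different representations; a numpy sketch (synthetic data, helper names ours):

```python
import numpy as np

rng = np.random.default_rng(3)
N, D = 300, 4
Z = rng.gamma(3.0, size=(N, D))
Z /= Z.sum(axis=1, keepdims=True)

Psi1 = np.hstack([np.eye(D - 1), -np.ones((D - 1, 1))])   # alr
Psi2 = np.zeros((D - 1, D))                                # Helmert-type ilr
for i in range(1, D):
    Psi2[i - 1, :i] = 1.0 / np.sqrt(i * (i + 1))
    Psi2[i - 1, i] = -i / np.sqrt(i * (i + 1))

def mahal2(Psi, z1, z2):
    """Squared Mahalanobis distance computed in the psi representation."""
    S = np.cov(np.log(Z) @ Psi.T, rowvar=False)
    d = Psi @ np.log(z1 / z2)
    return float(d @ np.linalg.solve(S, d))

d1 = mahal2(Psi1, Z[0], Z[1])
d2 = mahal2(Psi2, Z[0], Z[1])
assert np.isclose(d1, d2)
```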
Proposition 6
(invariance of the normal distribution) The probability density function of the normal distribution on the simplex with center \({\mathbf {m}}\) and variation matrix \({\mathbf {T}}\),
$$ f({\mathbf {z}}) = (2\pi )^{-(D-1)/2}\, |\varvec{\varSigma }^\psi |^{-1/2}\, \exp \left( -\frac{1}{2}\, d_M^2({\mathbf {z}},{\mathbf {m}})\right) , $$
does not depend on the choice of full-rank log-ratio representation \(\psi (\cdot )\).
Analogous results are available for the case when the log-ratio transformation is not full-rank. In that case the determinant \(|\varvec{\varSigma }^\psi |\) needs to be generalised to the product of its non-zero eigenvalues. This invariance (Mateu-Figueras et al. 2013) is a direct consequence of the preceding Proposition 5 and the fact that the determinant of a matrix is one of its invariants.
A.2 Spatial Results
Definition 3
(compositional random function) A vector-valued random function \({\mathbf {Z}}=[Z_1,Z_2,\ldots , Z_D]\) on a spatial domain \(\mathcal {D} \subset \mathbb {R}^p\), is called compositional if for each \(x\in \mathcal {D}\) the vector of random variables \({\mathbf {Z}}(x)=[Z_1(x),Z_2(x),\ldots , Z_D(x)]\) shows the relative importance of a set of parts forming a total of interest.
Definition 4
(regionalized composition) Given a set of locations \(\{x_1, x_2, \ldots , x_N\}\), a regionalized data set \(\{ {\mathbf {z}}_1, {\mathbf {z}}_2, \ldots , {\mathbf {z}}_N \}\) with \({\mathbf {z}}_i={\mathbf {z}}(x_i)=[z_{1}(x_i), \ldots z_{D}(x_i)]=[z_{1i}, \dots , z_{Di}]\), \(i=1,2, \ldots ,N\) is called a regionalized composition, if \(z_{ki}\) represents the relative importance of part k with respect to the set of components considered at location \(x_i\).
Proposition 7
(log-ratio representation of the spatial structure) Let \({\mathbf {Z}}=[z_{ki}]=[z_k(x_i)]\), \(k=1,2,\ldots , D\), \(i=1,2, \ldots ,N\), be a regionalized compositional data set with N locations \(x_i\) and D parts, and \(\psi (\cdot )\) be a full-rank log-ratio transformation. Then, for each lag h, the variogram of the log-ratio representation can be obtained from the empirical variation-variogram \(\hat{{\mathbf {T}}}(h)\) as \(\hat{\varvec{\varSigma }}^\psi (h) = -\frac{1}{2} \varvec{\varPsi }\cdot \hat{{\mathbf {T}}}(h)\cdot \varvec{\varPsi }^t\), or from the clr-variogram matrix as \(\hat{\varvec{\varGamma }}^\psi (h) = \varvec{\varPsi }\cdot \hat{\varvec{\varGamma }}^c(h) \cdot \varvec{\varPsi }^t\).
This is a direct consequence of Propositions 3 and 4.
Proposition 8
(equivalence of the spatial structure) Let \({\mathbf {Z}}=[z_{kn}]=[z_k(x_n)]\), \(k=1,2,\ldots , D\), \(n=1,2, \ldots ,N\), be a regionalized compositional data set with N locations \(x_n\) and D parts, and let \(\psi _1(\cdot )\) and \(\psi _2(\cdot )\) be two full-rank log-ratio transformations. Then, for each lag h, the empirical variograms \(\hat{\varvec{\varGamma }}^{\psi _1}(h)\) and \(\hat{\varvec{\varGamma }}^{\psi _2}(h)\) are related through the linear relationship
$$ \hat{\varvec{\varGamma }}^{\psi _2}(h) = {\mathbf {A}}_{12}\cdot \hat{\varvec{\varGamma }}^{\psi _1}(h)\cdot {\mathbf {A}}_{12}^t, $$
with matrix \({\mathbf {A}}_{12}=\varvec{\varPsi }_2\cdot \varvec{\varPsi }_1^-\) square and invertible.
Proof
From Proposition 7 it follows that \(\hat{\varvec{\varGamma }}^{\psi _2}(h)= \varvec{\varPsi }_2 \cdot \hat{\varvec{\varGamma }}^c(h) \cdot \varvec{\varPsi }_2^t\); and because of Eq. (20), \(\hat{\varvec{\varGamma }}^c(h)= \varvec{\varPsi }_1^{-} \cdot \hat{\varvec{\varGamma }}^{\psi _1}(h) \cdot \varvec{\varPsi }_1^{-t}\). Therefore
$$ \hat{\varvec{\varGamma }}^{\psi _2}(h) = \varvec{\varPsi }_2\cdot \varvec{\varPsi }_1^{-}\cdot \hat{\varvec{\varGamma }}^{\psi _1}(h)\cdot \varvec{\varPsi }_1^{-t}\cdot \varvec{\varPsi }_2^t = {\mathbf {A}}_{12}\cdot \hat{\varvec{\varGamma }}^{\psi _1}(h)\cdot {\mathbf {A}}_{12}^t, $$
which proves the desired equality because \({\mathbf {A}}_{12}^t=\varvec{\varPsi }_1^{-t}\cdot \varvec{\varPsi }_2^t\). \(\square \)
Since Proposition 8 holds for all lags, it is natural to require that any fitted model satisfies the same relation. This is automatically the case if a linear model of coregionalization \({\mathbf {T}}(h|\varvec{\theta })\) is fitted to the variation-variograms and then recast to each of the two log-ratio representations via Proposition 7.
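The recasting of Proposition 7 can be checked on a toy 1-D transect: the variogram obtained from the variation-variogram coincides with the variogram computed directly on the scores. A numpy sketch (synthetic data, variable names ours):

```python
import numpy as np

rng = np.random.default_rng(4)
N, D, h = 120, 3, 1                                   # transect length, parts, lag (in steps)
Z = rng.gamma(2.0, size=(N, D))
Z /= Z.sum(axis=1, keepdims=True)                     # toy regionalized composition
L = np.log(Z)

# empirical variation-variogram at lag h:
# T_h[i, j] = (1/2) * mean over pairs of [ln(z_i/z_j)(x+h) - ln(z_i/z_j)(x)]^2
R = L[:, :, None] - L[:, None, :]                     # all pairwise log-ratios per location
dR = R[h:] - R[:-h]
T_h = 0.5 * np.mean(dR ** 2, axis=0)

Psi = np.hstack([np.eye(D - 1), -np.ones((D - 1, 1))])
G_psi = -0.5 * Psi @ T_h @ Psi.T                      # recast via Proposition 7

# compare with the variogram computed directly on the psi-scores
dS = (L @ Psi.T)[h:] - (L @ Psi.T)[:-h]
G_direct = 0.5 * dS.T @ dS / dS.shape[0]
assert np.allclose(G_psi, G_direct)
```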
Proposition 9
(invariance of the cokriging predictor and errors) Let \({\mathbf {Z}}=[z_{kn}]=[z_k(x_n)]\), \(k=1,2,\ldots , D\), \(n=1,2, \ldots ,N\), be a regionalized compositional data set with N locations \(x_n\) and D parts, and \(\psi _1(\cdot )\) and \(\psi _2(\cdot )\) be two full-rank log-ratio transformations. Then, the corresponding cokriging predictors \(\hat{\varvec{\zeta }}_{1}(x_0)\) and \(\hat{\varvec{\zeta }}_{2}(x_0)\) of the log-ratio transformed composition \( \varvec{\zeta }_i(x_0) = \psi _i({\mathbf {Z}}(x_0)) \) satisfy
$$ \hat{\varvec{\zeta }}_{2}(x_0) = {\mathbf {A}}_{12}\cdot \hat{\varvec{\zeta }}_{1}(x_0), $$
so that
$$ \hat{{\mathbf {z}}}(x_0) = \psi _1^{-1}\left( \hat{\varvec{\zeta }}_{1}(x_0)\right) = \psi _2^{-1}\left( \hat{\varvec{\zeta }}_{2}(x_0)\right) $$
gives a predicted composition independent of the log-ratio representation used in the computations. Moreover, the corresponding cokriging error covariance matrices \({\mathbf {S}}_1\) and \({\mathbf {S}}_2\) are related by
$$ {\mathbf {S}}_2 = {\mathbf {A}}_{12}\cdot {\mathbf {S}}_1\cdot {\mathbf {A}}_{12}^t, $$
with \({\mathbf {A}}_{12}=\varvec{\varPsi }_2\cdot \varvec{\varPsi }_1^-\), for all forms of cokriging (simple, ordinary, universal and cokriging with a trend) at all locations \(x_0\), if both are derived from the same linear model of coregionalization \({\mathbf {T}}(h|\varvec{\theta })\).
Proof
The case of simple cokriging (SK) under the assumption of second-order stationarity will be considered first. In both log-ratio representations, the SK predictor is of the form
$$ \hat{\varvec{\zeta }}(x_0) = \varvec{\varLambda }^t\cdot \varvec{Z}, \qquad (34) $$
where \(\varvec{Z}=[\varvec{\zeta }(x_1); \varvec{\zeta }(x_2);\ldots ; \varvec{\zeta }(x_N)]\) is the concatenated vector of all log-ratio transformed observations \(\varvec{\zeta }(x_n)=\varvec{\varPsi }\ln {\mathbf {z}}(x_n)\), and \(\varvec{\varLambda }=[\varvec{\lambda }_{1};\varvec{\lambda }_{2};\ldots ;\varvec{\lambda }_{N} ]\) is the block matrix of all cokriging weight matrices, which are obtained as (Myers 1982)
$$ {\mathbf {W}}\cdot \varvec{\varLambda } = \varvec{\varGamma }_0, \qquad {\mathbf {W}}=[\varvec{\varGamma }_{nm}]_{n,m=1}^{N}, \qquad \varvec{\varGamma }_0=[\varvec{\varGamma }_{n0}]_{n=1}^{N}, $$
where each block \(\varvec{\varGamma }_{nm}=\varvec{\varGamma }(h|\varvec{\theta })=-\frac{1}{2} \varvec{\varPsi }{\mathbf {T}}(h|\varvec{\theta }) \varvec{\varPsi }^t\) using the fitted model \({\mathbf {T}}(h|\varvec{\theta })\). With the same notation, the SK error covariance is given by
$$ {\mathbf {S}} = \varvec{\varGamma }_{00} - \varvec{\varGamma }_0^t\cdot \varvec{\varLambda }. $$
Considering these matrices obtained with the two distinct log-ratio representations, and taking Eq. (33) into account, then
$$ {\mathbf {W}}^{(2)} = {\mathbf {A}}\cdot {\mathbf {W}}^{(1)}\cdot {\mathbf {A}}^t, \qquad (35) $$
where \({\mathbf {A}}={\text {diag}}({\mathbf {A}}_{12}, {\mathbf {A}}_{12}, \ldots , {\mathbf {A}}_{12})\) and similarly
$$ \varvec{\varGamma }_0^{(2)} = {\mathbf {A}}\cdot \varvec{\varGamma }_0^{(1)}\cdot {\mathbf {A}}_{12}^t. \qquad (36) $$
Now substituting Eqs. (35) and (36) into the expression for the weights,
$$ \varvec{\varLambda }^{(2)} = ({\mathbf {W}}^{(2)})^{-1}\cdot \varvec{\varGamma }_0^{(2)} = {\mathbf {A}}^{-t}\cdot ({\mathbf {W}}^{(1)})^{-1}\cdot {\mathbf {A}}^{-1}\cdot {\mathbf {A}}\cdot \varvec{\varGamma }_0^{(1)}\cdot {\mathbf {A}}_{12}^t = {\mathbf {A}}^{-t}\cdot \varvec{\varLambda }^{(1)}\cdot {\mathbf {A}}_{12}^t, \qquad (37) $$
which implies that the cokriging weight matrices of each datum satisfy
$$ \varvec{\lambda }_n^{(2)} = {\mathbf {A}}_{12}^{-t}\cdot \varvec{\lambda }_n^{(1)}\cdot {\mathbf {A}}_{12}^t, $$
due to the block-diagonal structure of \({\mathbf {A}}\). Finally, substituting these weights into the SK predictor of the second log-ratio representation, and taking into account Eq. (32) between the data, one obtains
$$ \hat{\varvec{\zeta }}_2(x_0) = (\varvec{\varLambda }^{(2)})^t\cdot \varvec{Z}^{(2)} = {\mathbf {A}}_{12}\cdot (\varvec{\varLambda }^{(1)})^t\cdot {\mathbf {A}}^{-1}\cdot {\mathbf {A}}\cdot \varvec{Z}^{(1)} = {\mathbf {A}}_{12}\cdot \hat{\varvec{\zeta }}_1(x_0), $$
thus establishing the identity between the cokriging predictors. To derive the relation for the cokriging error covariance, the same strategy can be used to express the error in the second log-ratio representation as a function of that in the first representation,
$$ {\mathbf {S}}_2 = \varvec{\varGamma }_{00}^{(2)} - (\varvec{\varGamma }_0^{(2)})^t\cdot \varvec{\varLambda }^{(2)} = {\mathbf {A}}_{12}\cdot \left( \varvec{\varGamma }_{00}^{(1)} - (\varvec{\varGamma }_0^{(1)})^t\cdot \varvec{\varLambda }^{(1)}\right) \cdot {\mathbf {A}}_{12}^t = {\mathbf {A}}_{12}\cdot {\mathbf {S}}_1\cdot {\mathbf {A}}_{12}^t, $$
which proves the desired equivalence.
For the remaining cases of cokriging (which will be grouped under the name of universal cokriging, UK), the log-ratio mean is assumed to have the form
$$ {\text {E}}[\varvec{\zeta }(x)] = \sum _{l=1}^{L} g_l(x)\, \varvec{\beta }_l, $$
with the typical cases \(L=1\) and \(g_1(x)\equiv 1\) (for ordinary cokriging), \(g_l(x)=x^{l-1}\) up to the desired order L (universal cokriging), or \(L=1\) and \(g_1(x)\) an arbitrary function available everywhere in the estimation domain (for cokriging with a trend). In any case, the UK predictor has the same form (Eq. 34), where the weights are obtained from the solution of the system
$$ \sum _{m=1}^{N} \varvec{\varGamma }_{nm}\cdot \varvec{\lambda }_m + \sum _{l=1}^{L} g_l(x_n)\, \varvec{\nu }_l = \varvec{\varGamma }_{n0}, \quad n=1,2,\ldots ,N, $$
subject to the L unbiasedness conditions
$$ \sum _{n=1}^{N} g_l(x_n)\, \varvec{\lambda }_n^t = g_l(x_0)\, {\mathbf {I}}_{D-1}, \quad l=1,2,\ldots ,L, $$
where \({\mathbf {I}}_{D-1}\) is the identity matrix of size \((D-1)\), the dimension of the composition. It is known (Myers 1982; Tolosana-Delgado 2006) that this is equivalent to solving an extended system of equations
$$ {\mathbf {W}}_e\cdot \varvec{\varLambda }_e = \varvec{\varGamma }_{0,e}, \qquad (38) $$
where
$$ {\mathbf {W}}_e = \left[ \begin{matrix} {\mathbf {W}} & {\mathbf {G}} \\ {\mathbf {G}}^t & {\mathbf {0}} \end{matrix}\right] , \qquad \varvec{\varLambda }_e = \left[ \begin{matrix} \varvec{\varLambda } \\ {\mathbf {N}} \end{matrix}\right] , \qquad \varvec{\varGamma }_{0,e} = \left[ \begin{matrix} \varvec{\varGamma }_0 \\ {\mathbf {G}}_0 \end{matrix}\right] , $$
with \({\mathbf {N}}^t=[\varvec{\nu }_1;\varvec{\nu }_2; \ldots ; \varvec{\nu }_L]\) the Lagrange multipliers for each unbiasedness condition, \({\mathbf {G}}_0^t=[g_1(x_0)\,{\mathbf {I}}_{D-1};\ldots ;g_L(x_0)\,{\mathbf {I}}_{D-1}]\), and \({\mathbf {G}}^t=[{\mathbf {G}}^t_1;{\mathbf {G}}^t_2; \ldots ; {\mathbf {G}}^t_N]\) with
$$ {\mathbf {G}}_n^t = \left[ g_1(x_n)\,{\mathbf {I}}_{D-1};\; g_2(x_n)\,{\mathbf {I}}_{D-1};\; \ldots ;\; g_L(x_n)\,{\mathbf {I}}_{D-1}\right] . $$
The UK error covariance matrix is then shown to be
$$ {\mathbf {S}} = \varvec{\varGamma }_{00} - \varvec{\varGamma }_{0,e}^t\cdot \varvec{\varLambda }_e. $$
Since the UK and SK systems of equations, predictors and errors have analogous forms, the proposition for the case of UK can be proved by showing that, if the extended matrices satisfy Eqs. (35)–(37), then they satisfy the UK system of equations (Eq. 38) as well. That is, if \({\mathbf {W}}_e^{(2)}={\mathbf {A}} {\mathbf {W}}_e^{(1)} {\mathbf {A}}^t\) (Eq. 35) and \(\varvec{\varLambda }_e^{(2)}={\mathbf {A}}^{-t}\varvec{\varLambda }_e^{(1)}{\mathbf {A}}_{12}^t\) (Eq. 37), then the left-hand side of Eq. (38) becomes
$$ {\mathbf {W}}_e^{(2)}\cdot \varvec{\varLambda }_e^{(2)} = {\mathbf {A}}\cdot {\mathbf {W}}_e^{(1)}\cdot {\mathbf {A}}^{t}\cdot {\mathbf {A}}^{-t}\cdot \varvec{\varLambda }_e^{(1)}\cdot {\mathbf {A}}_{12}^t = {\mathbf {A}}\cdot \varvec{\varGamma }_{0,e}^{(1)}\cdot {\mathbf {A}}_{12}^t = \varvec{\varGamma }_{0,e}^{(2)}, $$
which holds given Eq. (36).\(\square \)
Lastly, the relationship between the Gauss–Hermite quadratures for distinct log-ratio representations \(\psi _1(\cdot )\) and \(\psi _2(\cdot )\) is established. The weights \(w_1, w_2, \ldots , w_k\) and quadrature points \(u_1, u_2, \ldots , u_k\) do not depend on the choice of log-ratio representation. If \(\hat{\varvec{\zeta }}_{i}\) denotes the predictor under the i-th log-ratio representation, then by Proposition 9 the two predictors are related by \({\mathbf {A}}_{12}\cdot \hat{\varvec{\zeta }}_{1}=\hat{\varvec{\zeta }}_{2}\).
The spectral decomposition of the cokriging error covariance matrix \({\mathbf {S}}_1^K\) is given by \({\mathbf {S}}_1^K = {\mathbf {V}}_1\cdot {\mathbf {D}}_1\cdot {\mathbf {V}}_1^t\), where \({\mathbf {D}}_1\) is a diagonal matrix of eigenvalues and \({\mathbf {V}}_1\) is an orthogonal matrix of eigenvectors. Then \({\mathbf {R}}_1 = {\mathbf {V}}_1\cdot {\mathbf {D}}_1^{1/2} \cdot {\mathbf {V}}_1^t\) is a square root of \({\mathbf {S}}_1^K\), and so from the congruence one has
This expression can be rewritten as
and so
is a square root of \({\mathbf {S}}_2^K\) if and only if \({\mathbf {A}}_{12}^t = {\mathbf {A}}_{12}^{-1}\), that is, \({\mathbf {A}}_{12}\) is an orthogonal matrix. In that case the quadrature vectors \(\varvec{\zeta }(i_1, i_2, \ldots , i_{D-1})\) are related by
where \({\mathbf {v}}_{[i_1, i_2, \ldots , i_{D-1}]}={\mathbf {A}}_{12} \cdot {\mathbf {u}}_{[i_1, i_2, \ldots , i_{D-1}]}\). Thus Gauss–Hermite quadratures are invariant under the choice of \({\text {ilr}}\) transformation only, but they are not affine equivariant.
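The role of orthogonality in this argument can be verified numerically. The sketch below, with an arbitrary positive-definite matrix standing in for the cokriging error covariance \({\mathbf {S}}_1^K\), checks that the congruence of a symmetric square root is again a square root when \({\mathbf {A}}_{12}\) is orthogonal (as for a change between two ilr bases), but not for a general change of representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 3-part composition: log-ratio scores live in R^(D-1) = R^2.
D = 3

# Assumed symmetric positive-definite error covariance S1 (illustrative).
B = rng.standard_normal((D - 1, D - 1))
S1 = B @ B.T + (D - 1) * np.eye(D - 1)

# Symmetric square root via spectral decomposition: R1 = V D^(1/2) V^t.
vals, V = np.linalg.eigh(S1)
R1 = V @ np.diag(np.sqrt(vals)) @ V.T
assert np.allclose(R1 @ R1, S1)

# Congruence S2 = A12 S1 A12^t.
# Case 1: orthogonal A12 (change between two ilr representations).
theta = 0.7
A_orth = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
S2 = A_orth @ S1 @ A_orth.T
R2 = A_orth @ R1 @ A_orth.T
print(np.allclose(R2 @ R2, S2))     # True: A12 R1 A12^t is a root of S2

# Case 2: non-orthogonal A12 (e.g. an ilr-to-alr change): the congruence
# of R1 is no longer a square root of the congruence of S1.
A_gen = np.array([[1.0, 0.5], [0.0, 2.0]])
S2g = A_gen @ S1 @ A_gen.T
R2g = A_gen @ R1 @ A_gen.T
print(np.allclose(R2g @ R2g, S2g))  # False in general
```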
Appendix B: Compositional Geostatistics Workflow
B.1 Interpolation
1. Perform both classical and compositional exploratory analysis (Sect. 3.4)
2. Compute variation-variograms of the regionalized composition (Eq. 22)
3. Fit a valid model (Sect. 5.2); models such as the linear model of coregionalization or the minimum/maximum autocorrelation factors are useful
4. Recast both the experimental and the model variation-variograms into other log-ratio representations with Eqs. (23) and (25), respectively, in order to confirm that the model fits the data reasonably well in these other log-ratio representations
5. Choose one of these alternative log-ratio transforms and compute the scores of the data (Eq. 10)
6. Apply cokriging to the log-ratio scores, with the variogram model expressed in the same log-ratios, on a suitably chosen grid; store the cokriging error covariance matrices if cross-validation or Gauss–Hermite quadrature is desired
7. Backtransform the predicted values
8. If unbiased estimates of the mass of each component are required and an ilr is being used, estimate them through Gauss–Hermite quadratures (Eq. 28); otherwise, follow the procedure in Sect. B.2
9. Further products (maps, cross-validation, block models, etc.) can be derived from individual components of the composition or from relevant log-ratios; cross-validation studies should focus on multivariate quantities and pairwise log-ratio plots (Sect. 6.2).
Steps (2) and (3) can alternatively be applied to data transformed via a particular log-ratio transformation. In this case, step (4) should also explore the fit of the model to the variation-variograms, and step (5) can be applied to the same log-ratio set as in step (2). This is the strategy followed in the paper, where all calculations were primarily done with the alr-transformed data.
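The core of steps (5)–(7) can be sketched as follows. The toy data, the shared exponential covariance model and the use of simple kriging with known mean are illustrative assumptions (the shared model corresponds to intrinsic correlation, a special case of the linear model of coregionalization under which cokriging collapses to per-score kriging); they stand in for a fitted variation-variogram model and a full cokriging system.

```python
import numpy as np

def alr(comp):
    """Additive log-ratio scores, last part as denominator."""
    comp = np.asarray(comp, float)
    return np.log(comp[..., :-1] / comp[..., -1:])

def alr_inv(scores):
    """Backtransform alr scores to a closed composition."""
    z = np.concatenate([scores, np.zeros(scores.shape[:-1] + (1,))], axis=-1)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy data: four sample locations in 1D, a 3-part composition (assumed).
x = np.array([0.0, 1.0, 2.5, 4.0])
comps = np.array([[0.6, 0.3, 0.1],
                  [0.5, 0.3, 0.2],
                  [0.4, 0.4, 0.2],
                  [0.3, 0.5, 0.2]])
Z = alr(comps)                      # step 5: (4, 2) log-ratio scores

# Assumed isotropic exponential covariance shared by both scores.
def cov(h, sill=1.0, rng_=2.0):
    return sill * np.exp(-np.abs(h) / rng_)

x0 = 1.8                            # prediction location
C = cov(x[:, None] - x[None, :])    # data-to-data covariances
c0 = cov(x - x0)                    # data-to-target covariances
w = np.linalg.solve(C, c0)          # simple-kriging weights

m = Z.mean(axis=0)                  # known mean taken as the data mean
z_hat = m + w @ (Z - m)             # step 6: kriged log-ratio scores
comp_hat = alr_inv(z_hat)           # step 7: backtransform to a composition
print(comp_hat, comp_hat.sum())     # positive parts summing to 1
```

The backtransformed prediction automatically honors the compositional constraints (positivity and closure), which is the point of working on log-ratio scores.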
B.2 Simulation
1. Apply a log-ratio transformation to the data, then transform the scores via a multivariate Gaussian anamorphosis, such as the flow anamorphosis (Sect. 7.2)
2. Estimate direct and cross-variograms of the Gaussian scores
3. Fit a valid joint model to these variograms
4. Apply conditional simulation algorithms to produce simulations of the Gaussian scores
5. Transform the simulated Gaussian scores to log-ratio scores with the inverse Gaussian anamorphosis, then backtransform the log-ratio scores to compositions
6. Post-process the simulations as desired, that is, produce point-wise estimates of non-linear quantities (Eq. 27), upscale them to block averages (Eqs. 29–31) or produce maps.
Tolosana-Delgado, R., Mueller, U. & van den Boogaart, K.G. Geostatistics for Compositional Data: An Overview. Math Geosci 51, 485–526 (2019). https://doi.org/10.1007/s11004-018-9769-3