Development and Evaluation of Geostatistical Methods for Non-Euclidean-Based Spatial Covariance Matrices

Mathematical Geosciences

A Correction to this article was published on 08 May 2019

This article has been updated

Abstract

Geostatistical modeling customarily assumes that inter-point distances are Euclidean (i.e., as the crow flies) when characterizing spatial variation. In many real-world settings, however, a non-Euclidean distance is more appropriate, for example in complex bodies of water. If such a distance is used with current semivariogram functions, the resulting spatial covariance matrices are no longer guaranteed to be positive-definite. Previous attempts to address this issue for geostatistical prediction (i.e., kriging) models transform the non-Euclidean space into a Euclidean one, for example through multi-dimensional scaling (MDS), and estimate spatial covariances only after the distances are scaled. An alternative method is proposed that re-estimates a spatial covariance structure originally based on a non-Euclidean distance metric to ensure validity. This method is compared with the standard use of Euclidean distance and with a previously utilized MDS method. All methods are evaluated using cross-validation assessments on both simulated and real-world experiments. Results show a high level of bias in prediction variance for the MDS method, which has not been highlighted previously. Conversely, the proposed method offers a preferred tradeoff between prediction accuracy and prediction variance and at times outperforms the existing methods on both sets of metrics. Overall, the results indicate that the proposed method can provide improved geostatistical predictions while ensuring valid results when the use of non-Euclidean distances is warranted.
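The positive-definiteness issue described in the abstract can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the article: the four-site layout (one site at a confluence, three sites one unit up separate inlets), the Gaussian covariance model with unit sill and range parameter 2, and the helper names `gaussian_cov`, `classical_mds`, and `pairwise_euclidean`. The classical (Torgerson) MDS step is a generic textbook version, not necessarily the MDS variant evaluated in the article, and the article's proposed re-estimation method is not reproduced here.

```python
import numpy as np


def gaussian_cov(dist, sill=1.0, range_par=2.0):
    """Gaussian covariance model C(d) = sill * exp(-(d / range_par)**2).

    The model and its parameter values are illustrative assumptions.
    """
    return sill * np.exp(-(dist / range_par) ** 2)


def pairwise_euclidean(coords):
    """Matrix of Euclidean distances between the rows of coords."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def classical_mds(dist, n_dims=2):
    """Classical (Torgerson) MDS: coordinates whose Euclidean distances
    approximate the supplied distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]         # largest eigenvalues first
    kept = np.clip(eigvals[order], 0.0, None)          # guard against negatives
    return eigvecs[:, order] * np.sqrt(kept)


# Hypothetical layout: site 0 at a confluence, sites 1-3 one unit up three inlets.
angles = np.deg2rad([90.0, 210.0, 330.0])
coords = np.vstack([[0.0, 0.0],
                    np.column_stack([np.cos(angles), np.sin(angles)])])

# Straight-line distances versus in-water (shortest path via the confluence).
euclid = pairwise_euclidean(coords)
water = np.array([[0.0, 1.0, 1.0, 1.0],
                  [1.0, 0.0, 2.0, 2.0],
                  [1.0, 2.0, 0.0, 2.0],
                  [1.0, 2.0, 2.0, 0.0]])

min_eig_euclid = np.linalg.eigvalsh(gaussian_cov(euclid)).min()
min_eig_water = np.linalg.eigvalsh(gaussian_cov(water)).min()

# MDS route: embed the water distances in the plane, then build the covariance
# from the Euclidean distances between the embedded points.
mds_dist = pairwise_euclidean(classical_mds(water, n_dims=2))
min_eig_mds = np.linalg.eigvalsh(gaussian_cov(mds_dist)).min()

print("smallest eigenvalue, Euclidean distance:", min_eig_euclid)
print("smallest eigenvalue, water distance:   ", min_eig_water)
print("smallest eigenvalue, MDS distances:    ", min_eig_mds)
print("confluence-to-inlet distance after MDS:", mds_dist[0, 1])
```

In this toy layout the water-distance covariance matrix has a negative eigenvalue, so it is not a valid covariance matrix, while the Euclidean and MDS-based matrices are positive-definite. The MDS embedding restores validity only by altering the distances: the confluence-to-inlet distance grows from 1 to roughly 1.15, which shows that covariances estimated after scaling are based on distances that differ from the original non-Euclidean ones.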

Change history

  • 08 May 2019

    The original version of this article unfortunately contained a mistake in equation 9.

Acknowledgements

This work was supported by the National Institute of Allergy and Infectious Diseases [Grant No. 1R01AI123931-01A1 to F.C.C. (principal investigator)]. Additional support for B.J.K.D. was provided in part by the Johns Hopkins’ Environment, Energy, Sustainability & Health Institute Fellowship and the Center for a Livable Future-Lerner Fellowship, as well as the National Science Foundation’s Water, Climate, and Health Integrative Education and Research traineeship (Grant No. 1069213). The authors would like to thank Tim Shields for helping to develop the schematic maps displayed in this paper.

Author information

Corresponding author

Correspondence to Benjamin J. K. Davis.

Appendix A: List of Equations for Cross-Validation Metrics

Equations for the cross-validation metrics are provided below:

$$ \text{CV-}R^{2} = \frac{\sum_{i=1}^{n} \left( y_{i} - \widehat{y}_{i} \right)^{2}}{\sum_{i=1}^{n} \left( y_{i} - \overline{y} \right)^{2}}; \quad \text{where } \overline{y} = \frac{1}{n} \sum_{i=1}^{n} y_{i}, $$
(9)
$$ \text{ME} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{i} - \widehat{y}_{i} \right), $$
(10)
$$ \text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_{i} - \widehat{y}_{i} \right)^{2}}, $$
(11)
$$ \text{MPSE} = \frac{1}{n} \sum_{i=1}^{n} \widehat{\sigma}_{i}, $$
(12)
$$ \text{RMSSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{y_{i} - \widehat{y}_{i}}{\widehat{\sigma}_{i}} \right)^{2}}, $$
(13)
$$ \text{PI95} = \frac{1}{n} \sum_{i=1}^{n} I_{i}; \quad \text{where } I_{i} = \begin{cases} 1 & \text{if } \widehat{y}_{i} - 1.96\,\widehat{\sigma}_{i} \le y_{i} \le \widehat{y}_{i} + 1.96\,\widehat{\sigma}_{i} \\ 0 & \text{if } y_{i} < \widehat{y}_{i} - 1.96\,\widehat{\sigma}_{i} \text{ or } y_{i} > \widehat{y}_{i} + 1.96\,\widehat{\sigma}_{i} \end{cases}, $$
(14)

where \( y_{i} \) is the observed outcome at location \( i \), \( \widehat{y}_{i} \) is the predicted (i.e., kriged) value, \( \widehat{\sigma }_{i} \) is the kriging standard error, and \( n \) is the number of locations over which the sums are taken.
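As a minimal sketch, the metrics in Eqs. (9) to (14) could be computed from cross-validation output as follows. The function name `cv_metrics`, the dictionary keys, and the toy numbers are hypothetical. Eq. (9) is coded exactly as printed, as the ratio of the squared-error sum to the total sum of squares; the conventional coefficient of determination, one minus that ratio, is returned alongside it as an assumption about how the metric is usually read.

```python
import numpy as np


def cv_metrics(y, y_hat, sigma_hat):
    """Cross-validation summaries following Eqs. (9)-(14).

    y         : observed outcomes
    y_hat     : kriged (predicted) values at the same locations
    sigma_hat : kriging standard errors
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sigma_hat = np.asarray(sigma_hat, dtype=float)

    resid = y - y_hat
    sse = np.sum(resid ** 2)              # numerator of Eq. (9)
    sst = np.sum((y - y.mean()) ** 2)     # denominator of Eq. (9)
    ratio = sse / sst

    # Indicator I_i: observation falls inside the nominal 95% interval.
    inside = (y >= y_hat - 1.96 * sigma_hat) & (y <= y_hat + 1.96 * sigma_hat)

    return {
        "sse_sst_ratio": ratio,           # right-hand side of Eq. (9) as printed
        "one_minus_ratio": 1.0 - ratio,   # conventional R^2 (assumption)
        "ME": resid.mean(),                                  # Eq. (10)
        "RMSE": np.sqrt(np.mean(resid ** 2)),                # Eq. (11)
        "MPSE": sigma_hat.mean(),                            # Eq. (12)
        "RMSSE": np.sqrt(np.mean((resid / sigma_hat) ** 2)), # Eq. (13)
        "PI95": inside.mean(),                               # Eq. (14)
    }


# Toy cross-validation output (made-up numbers, for illustration only).
metrics = cv_metrics(
    y=[1.0, 2.0, 3.0, 4.0],
    y_hat=[1.5, 2.0, 2.5, 5.0],
    sigma_hat=[0.5, 0.5, 0.5, 0.4],
)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

ME and RMSE summarize prediction accuracy, whereas MPSE, RMSSE, and PI95 depend on the kriging standard errors and therefore reflect how well the prediction variance is calibrated. An RMSSE near 1 and a PI95 near 0.95 would indicate standard errors that match the observed prediction errors.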

Cite this article

Davis, B.J.K., Curriero, F.C. Development and Evaluation of Geostatistical Methods for Non-Euclidean-Based Spatial Covariance Matrices. Math Geosci 51, 767–791 (2019). https://doi.org/10.1007/s11004-019-09791-y

  • DOI: https://doi.org/10.1007/s11004-019-09791-y
