Bayesian nonstationary Gaussian process models via treed process convolutions

  • Waley W. J. Liang
  • Herbert K. H. Lee
Regular Article

Abstract

The Gaussian process is a common model in a wide variety of applications, such as environmental modeling, computer experiments, and geology. Two major challenges often arise: First, assuming that the process of interest is stationary over the entire domain often proves to be untenable. Second, the traditional Gaussian process model formulation is computationally inefficient for large datasets. In this paper, we propose a new Gaussian process model to tackle these problems based on the convolution of a smoothing kernel with a partitioned latent process. Nonstationarity can be modeled by allowing a separate latent process for each partition, which approximates a regional clustering structure. Partitioning follows a binary tree generating process similar to that of Classification and Regression Trees. A Bayesian approach is used to estimate the partitioning structure and model parameters simultaneously. Our motivating dataset consists of 11918 precipitation anomalies. Results show that our model has promising prediction performance and is computationally efficient for large datasets.
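
The core construction described above, a smoothing kernel convolved with a latent process that differs by partition, can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian kernel, the single fixed split point (standing in for a tree with one split), the region-specific latent standard deviations, and the function names `gaussian_kernel`, `treed_convolution_sample`, and `implied_covariance`. In particular, the split is fixed rather than estimated, so no Bayesian inference over the partitioning is shown.

```python
import numpy as np


def gaussian_kernel(d, bandwidth):
    """Unnormalized Gaussian smoothing kernel evaluated at distances d."""
    return np.exp(-0.5 * (d / bandwidth) ** 2)


def treed_convolution_sample(sites, knots, split, sigmas, bandwidth, rng):
    """Draw one realization z(s) = sum_j k(s - u_j) x_j.

    The latent values x_j live on the knots u_j. Each knot belongs to the
    region left (index 0) or right (index 1) of `split`, and x_j is drawn
    with that region's standard deviation (hypothetical parameterization).
    """
    knot_region = (knots >= split).astype(int)
    # Independent latent draws, with scale chosen by region membership.
    latent = rng.normal(0.0, sigmas[knot_region])
    # Kernel matrix K[i, j] = k(s_i - u_j), shape (n_sites, n_knots).
    K = gaussian_kernel(sites[:, None] - knots[None, :], bandwidth)
    return K @ latent, K


def implied_covariance(K, knots, split, sigmas):
    """Covariance of z implied by the construction: K diag(sigma^2) K^T.

    Its rank is at most the number of knots, which is why kernel
    convolutions of this kind are cheap for large numbers of sites.
    """
    knot_region = (knots >= split).astype(int)
    return (K * sigmas[knot_region] ** 2) @ K.T


# Illustrative settings only (not values from the paper).
rng = np.random.default_rng(0)
sites = np.linspace(0.0, 1.0, 50)
knots = np.linspace(0.0, 1.0, 10)
sigmas = np.array([0.5, 2.0])  # left region smooth, right region volatile
z, K = treed_convolution_sample(sites, knots, 0.5, sigmas, 0.1, rng)
cov = implied_covariance(K, knots, 0.5, sigmas)
```

In this sketch the marginal variance of `z` is larger to the right of the split than to the left, which is one simple form of nonstationarity, and all matrix work involves the 50-by-10 kernel matrix rather than a dense 50-by-50 covariance inversion.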

Keywords

Spatial statistics; Stochastic modeling; Classification and Regression Trees; Reduced-rank approximation; Heteroscedasticity

Mathematics Subject Classification

60G15 60G60 62M30 62M20 62F15 

Notes

Acknowledgements

This research was partially supported by National Science Foundation Grant DMS-0906720.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Applied Mathematics and Statistics, University of California, Santa Cruz, USA
