
Bayesian nonstationary Gaussian process models via treed process convolutions

  • Regular Article
Advances in Data Analysis and Classification

Abstract

The Gaussian process is a common model in a wide variety of applications, such as environmental modeling, computer experiments, and geology. Two major challenges often arise. First, the assumption that the process of interest is stationary over the entire domain often proves untenable. Second, the traditional Gaussian process model formulation is computationally inefficient for large datasets. In this paper, we propose a new Gaussian process model that tackles both problems; it is based on the convolution of a smoothing kernel with a partitioned latent process. Nonstationarity can be modeled by allowing a separate latent process for each partition, which approximates a regional clustering structure. Partitioning follows a binary tree generating process similar to that of Classification and Regression Trees. A Bayesian approach is used to estimate the partitioning structure and model parameters simultaneously. Our motivating dataset consists of 11,918 precipitation anomalies. Results show that our model has promising prediction performance and is computationally efficient for large datasets.
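To make the construction concrete, the following is a minimal one-dimensional sketch of a process convolution over a partitioned latent process. It is an illustration under stated assumptions, not the model from the paper: the Gaussian kernel, the single split point standing in for a depth-one binary tree, the per-leaf variances, and the rule that a site only uses knots in its own leaf are all hypothetical choices, as are the names `gaussian_kernel` and `treed_process_convolution`. No Bayesian estimation of the tree or parameters is attempted here.

```python
import math
import random


def gaussian_kernel(distance, bandwidth):
    """Unnormalized Gaussian smoothing kernel (an assumed kernel choice)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def treed_process_convolution(sites, knots, latent, split, bandwidth):
    """Evaluate z(s) = sum_j k(s - u_j) * x_j over knots in the leaf containing s.

    A single split point stands in for a depth-one binary tree (assumption):
    locations below `split` fall in the left leaf, all others in the right leaf.
    """
    values = []
    for s in sites:
        s_left = s < split
        total = 0.0
        for u, x in zip(knots, latent):
            if (u < split) == s_left:  # knot and site share a leaf
                total += gaussian_kernel(s - u, bandwidth) * x
        values.append(total)
    return values


rng = random.Random(0)
knots = [i / 10 for i in range(11)]  # latent-process knots on [0, 1]
split = 0.5                          # hypothetical tree split
# Separate latent process per leaf: small variance on the left, large on the right.
latent = [rng.gauss(0.0, 0.2 if u < split else 2.0) for u in knots]
sites = [i / 50 for i in range(51)]
z = treed_process_convolution(sites, knots, latent, split, bandwidth=0.1)
```

Because each leaf has its own latent variance, the resulting surface `z` is smooth within a leaf but can behave very differently on either side of the split, which is the kind of regional nonstationarity the abstract describes. Evaluation costs on the order of the number of sites times the number of knots, with no large covariance matrix to invert.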



Acknowledgements

This research was partially supported by National Science Foundation Grant DMS-0906720.

Author information


Corresponding author

Correspondence to Waley W. J. Liang.


About this article


Cite this article

Liang, W.W.J., Lee, H.K.H. Bayesian nonstationary Gaussian process models via treed process convolutions. Adv Data Anal Classif 13, 797–818 (2019). https://doi.org/10.1007/s11634-018-0341-2


  • DOI: https://doi.org/10.1007/s11634-018-0341-2
