A scalable Bayesian nonparametric model for large spatio-temporal data
The Bayesian nonparametric (BNP) approach is an effective tool for building flexible spatio-temporal probability models. Despite its flexibility, the resulting spatio-temporal models become computationally demanding when datasets are large. This paper develops a class of computationally efficient and easy-to-implement BNP models for large spatio-temporal data. Specifically, we introduce a random distribution for the spatio-temporal effects based on a stick-breaking construction in which the atoms are modeled in terms of a basis system. In this framework, a low-rank basis approximation models the spatial dependence and a vector autoregressive process models the temporal dependence. We demonstrate that the proposed model is an extension of the Gaussian low-rank model with similar computational complexity, and hence scales well to large spatio-temporal data. We assess the performance of the proposed model through a simulation study and then illustrate it by analyzing a dataset of precipitation measurements.
Keywords: Large datasets, Stick-breaking process, Non-stationarity, Non-Gaussianity
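The ingredients named in the abstract (stick-breaking weights, atoms expressed through a low-rank basis, and a vector autoregressive evolution of the basis coefficients) can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the paper's model: the truncation level `K`, the Gaussian kernel basis on a 3 x 3 knot grid, the diagonal VAR(1) matrix `phi * I`, the noise scale, and the independent per-location atom labels are all hypothetical choices made for illustration.

```python
import numpy as np


def stick_breaking_weights(alpha, K, rng):
    """Truncated stick-breaking weights p_k = V_k * prod_{j<k} (1 - V_j)."""
    v = rng.beta(1.0, alpha, size=K)
    v[-1] = 1.0  # truncation: the last stick takes all remaining mass
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def gaussian_basis(locs, knots, bandwidth):
    """n x r matrix of Gaussian kernel basis functions centred at the knots."""
    d2 = ((locs[:, None, :] - knots[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / bandwidth**2)


def simulate_effects(n=200, K=15, T=5, alpha=1.0, phi=0.7, seed=0):
    """Simulate spatio-temporal effects; all settings are illustrative."""
    rng = np.random.default_rng(seed)
    locs = rng.uniform(0.0, 1.0, size=(n, 2))  # random sites in the unit square

    # Low-rank spatial basis: r = 9 knots on a regular grid (assumed layout).
    g = np.linspace(0.1, 0.9, 3)
    knots = np.array([(a, b) for a in g for b in g])
    r = knots.shape[0]
    B = gaussian_basis(locs, knots, bandwidth=0.25)

    weights = stick_breaking_weights(alpha, K, rng)
    H = phi * np.eye(r)  # assumed VAR(1) transition matrix

    eta = rng.normal(size=(K, r))  # basis coefficients of the K atoms at t = 0
    effects = np.empty((T, n))
    for t in range(T):
        if t > 0:
            # VAR(1) step applied to each atom's coefficient vector
            eta = eta @ H.T + rng.normal(scale=0.5, size=(K, r))
        atoms = B @ eta.T  # n x K: every atom evaluated at every location
        labels = rng.choice(K, size=n, p=weights)  # atom chosen per location
        effects[t] = atoms[np.arange(n), labels]
    return weights, B, effects


weights, B, effects = simulate_effects()
print(weights.sum(), B.shape, effects.shape)
```

The per-time-step cost in this sketch is dominated by the product of the n x r basis matrix with the r x K coefficient matrix, so the rank r, not the number of locations squared, drives the computation. That is the sense in which a low-rank construction of this kind stays close to the cost of a Gaussian low-rank model.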
The Editor and two referees are gratefully acknowledged. Their precise comments and constructive suggestions have substantially improved the manuscript.