Abstract
In this paper, we extend and analyze a Bayesian hierarchical spatiotemporal model for physical systems. A novel feature is that the discrepancy between the output of a computer simulator for a physical process and the actual process values is modeled with a multivariate random walk. For computational efficiency, linear algebra for bandwidth-limited matrices is utilized, and first-order emulator inference allows for the fast emulation of a numerical partial differential equation (PDE) solver. A test scenario from a physical system motivated by glaciology is used to examine the speed and accuracy of the computational methods, in addition to the viability of the modeling assumptions. We conclude by discussing how the model and associated methodology can be applied in physical contexts other than glaciology.
References
Baum, L. E. and Petrie, T. (1966), “Statistical Inference for Probabilistic Functions of Finite State Markov Chains,” Annals of Mathematical Statistics, 37, 1554–1563, https://doi.org/10.1214/aoms/1177699147.
Berliner, L. M. (1996), “Hierarchical Bayesian Time Series Models,” in Hanson, K. M. and Silver, R. N. (editors), Maximum Entropy and Bayesian Methods, Dordrecht: Springer Netherlands.
Berliner, L. M. (2003), “Physical-statistical modeling in geophysics,” Journal of Geophysical Research: Atmospheres, 108, 8776, https://doi.org/10.1029/2002JD002865.
Berrocal, V., Gelfand, A., and Holland, D. (2014), “Assessing exceedance of ozone standards: a space-time downscaler for fourth highest ozone concentrations,” Environmetrics, 25, 279–291.
Björnsson, H. and Pálsson, F. (2008), “Icelandic glaciers,” Jökull, 58, 365–386.
Breiman, L. (2001), “Random Forests,” Machine Learning, 45, 5–32, https://doi.org/10.1023/A:1010933404324.
Brinkerhoff, D. J., Aschwanden, A., and Truffer, M. (2016), “Bayesian Inference of Subglacial Topography Using Mass Conservation,” Frontiers in Earth Science, 4, 8, http://journal.frontiersin.org/article/10.3389/feart.2016.00008.
Brynjarsdóttir, J. and O’Hagan, A. (2014), “Learning about physical parameters: the importance of model discrepancy,” Inverse Problems, 30, 114007, http://stacks.iop.org/0266-5611/30/i=11/a=114007.
Bueler, E., Lingle, C. S., Kallen-Brown, J. A., Covey, D. N., and Bowman, L. N. (2005), “Exact solutions and verification of numerical models for isothermal ice sheets,” Journal of Glaciology, 51, 291–306.
Calderhead, B., Girolami, M., and Lawrence, N. D. (2008), “Accelerating Bayesian Inference over Nonlinear Differential Equations with Gaussian Processes,” in Proceedings of the 21st International Conference on Neural Information Processing Systems, NIPS’08, USA: Curran Associates Inc., http://dl.acm.org/citation.cfm?id=2981780.2981808.
Chkrebtii, O. A., Campbell, D. A., Calderhead, B., Girolami, M. A., et al. (2016), “Bayesian Solution Uncertainty Quantification for Differential Equations,” Bayesian Analysis, 11, 1239–1267.
Conrad, P. R., Girolami, M., Särkkä, S., Stuart, A., and Zygalakis, K. (2017), “Statistical analysis of differential equations: introducing probability measures on numerical solutions,” Statistics and Computing, 27, 1065–1082.
Cressie, N. and Wikle, C. K. (2011), Statistics for Spatio-Temporal Data, John Wiley & Sons.
Cuffey, K. M. and Paterson, W. (2010), The Physics of Glaciers, Academic Press, 4th edition.
Flowers, G. E., Marshall, S. J., Björnsson, H., and Clarke, G. K. (2005), “Sensitivity of Vatnajökull ice cap hydrology and dynamics to climate warming over the next 2 centuries,” Journal of Geophysical Research: Earth Surface, 110.
Fowler, A. C. and Larson, D. A. (1978), “On the Flow of Polythermal Glaciers. I. Model and Preliminary Analysis,” Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences, 363, 217–242, http://www.jstor.org/stable/79748.
Friedman, J., Hastie, T., and Tibshirani, R. (2001), The Elements of Statistical Learning, volume 1, Springer Series in Statistics, New York, NY, USA.
Geirsson, Ó. P., Hrafnkelsson, B., and Simpson, D. (2015), “Computationally efficient spatial modeling of annual maximum 24-h precipitation on a fine grid,” Environmetrics, 26, 339–353, https://onlinelibrary.wiley.com/doi/abs/10.1002/env.2343.
Golub, G. H. and Van Loan, C. F. (2012), Matrix Computations, volume 3, Johns Hopkins University Press.
Gopalan, G., Hrafnkelsson, B., Adalgeirsdóttir, G., Jarosch, A. H., and Pálsson, F. (2018), “A Bayesian hierarchical model for glacial dynamics based on the shallow ice approximation and its evaluation using analytical solutions,” The Cryosphere, 12, 2229–2248.
Guan, Y., Haran, M., and Pollard, D. (2016), “Inferring Ice Thickness from a Glacier Dynamics Model and Multiple Surface Datasets,” ArXiv e-prints.
Gupta, A. and Kumar, V. (1994), “A scalable parallel algorithm for sparse Cholesky factorization,” in Proceedings of the 1994 ACM/IEEE Conference on Supercomputing, Supercomputing ’94, Los Alamitos, CA, USA: IEEE Computer Society Press, http://dl.acm.org/citation.cfm?id=602770.602898.
Higdon, D., Gattiker, J., Williams, B., and Rightley, M. (2008), “Computer Model Calibration Using High-Dimensional Output,” Journal of the American Statistical Association, 103, 570–583.
Higdon, D., Kennedy, M., Cavendish, J. C., Cafeo, J. A., and Ryne, R. D. (2004), “Combining Field Data and Computer Simulations for Calibration and Prediction,” SIAM Journal on Scientific Computing, 26, 448–466.
Hooten, M. B., Leeds, W. B., Fiechter, J., and Wikle, C. K. (2011), “Assessing First-Order Emulator Inference for Physical Parameters in Nonlinear Mechanistic Models,” Journal of Agricultural, Biological, and Environmental Statistics, 16, 475–494, https://doi.org/10.1007/s13253-011-0073-7.
Hutter, K. (1982), “A mathematical model of polythermal glaciers and ice sheets,” Geophysical & Astrophysical Fluid Dynamics, 21, 201–224, https://doi.org/10.1080/03091928208209013.
Kennedy, M. C. and O’Hagan, A. (2001), “Bayesian calibration of computer models,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63, 425–464.
Kusnierczyk, W. (2012), rbenchmark: Benchmarking routine for R, https://CRAN.R-project.org/package=rbenchmark. R package version 1.0.0.
Lehmann, E. and Casella, G. (2003), Theory of Point Estimation, Springer Texts in Statistics, Springer New York, https://books.google.com/books?id=0q-Bt0Ar-sgC.
Liaw, A. and Wiener, M. (2002), “Classification and Regression by randomForest,” R News, 2, 18–22, https://CRAN.R-project.org/doc/Rnews/.
Lindgren, F., Rue, H., and Lindström, J. (2011), “An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 423–498.
Liu, F. and West, M. (2009), “A dynamic modelling strategy for Bayesian computer model emulation,” Bayesian Analysis, 4, 393–411, https://doi.org/10.1214/09-BA415.
Madsen, H. (2007), Time Series Analysis, Chapman and Hall/CRC.
Murray, I., Adams, R. P., and MacKay, D. J. (2010), “Elliptical slice sampling,” Journal of Machine Learning Research W&CP, 9, 541–548.
Owhadi, H. and Scovel, C. (2017), “Universal Scalable Robust Solvers from Computational Information Games and fast eigenspace adapted Multiresolution Analysis,” ArXiv e-prints.
Pagendam, D., Kuhnert, P., Leeds, W., Wikle, C., Bartley, R., and Peterson, E. (2014), “Assimilating catchment processes with monitoring data to estimate sediment loads to the Great Barrier Reef,” Environmetrics, 25, 214–229.
Payne, A. J., Huybrechts, P., Abe-Ouchi, A., Calov, R., Fastook, J. L., Greve, R., Marshall, S. J., Marsiat, I., Ritz, C., Tarasov, L., and Thomassen, M. P. A. (2000), “Results from the EISMINT model intercomparison: the effects of thermomechanical coupling,” Journal of Glaciology, 46, 227–238.
Robert, C. (2007), The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation, Springer Texts in Statistics, Springer New York, https://books.google.com/books?id=NQ5KAAAAQBAJ.
Rue, H. (2001), “Fast sampling of Gaussian Markov random fields,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63, 325–338.
Rue, H. and Held, L. (2005), Gaussian Markov Random Fields: Theory and Applications, CRC press.
Salter, J. M., Williamson, D. B., Scinocca, J., and Kharin, V. (2019), “Uncertainty Quantification for Computer Models With Spatial Output Using Calibration-Optimal Bases,” Journal of the American Statistical Association, 0, 1–24, https://doi.org/10.1080/01621459.2018.1514306.
Shen, X. and Wasserman, L. (2001), “Rates of convergence of posterior distributions,” Annals of Statistics, 29, 687–714, https://doi.org/10.1214/aos/1009210686.
Sigurdarson, A. N. and Hrafnkelsson, B. (2016), “Bayesian prediction of monthly precipitation on a fine grid using covariates based on a regional meteorological model,” Environmetrics, 27, 27–41, https://ideas.repec.org/a/wly/envmet/v27y2016i1p27-41.html.
Solin, A. and Särkkä, S. (2014), “Explicit Link Between Periodic Covariance Functions and State Space Models,” in Kaski, S. and Corander, J. (editors), Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, volume 33 of Proceedings of Machine Learning Research, Reykjavik, Iceland: PMLR, http://proceedings.mlr.press/v33/solin14.html.
van der Vaart, A. (2000), Asymptotic Statistics, Cambridge University Press, https://books.google.com/books?id=UEuQEM5RjWgC.
van der Veen, C. (2013), Fundamentals of Glacier Dynamics, CRC Press, 2nd edition.
Whittle, P. (1954), “On Stationary Processes in the Plane,” Biometrika, 434–449.
Whittle, P. (1963), “Stochastic processes in several dimensions,” Bulletin of the International Statistical Institute, 40, 974–994.
Wikle, C. K. (2016), Hierarchical Models for Uncertainty Quantification: An Overview, Springer International Publishing, 1–26.
Wikle, C. K., Berliner, L. M., and Cressie, N. (1998), “Hierarchical Bayesian space-time models,” Environmental and Ecological Statistics, 5, 117–154, https://doi.org/10.1023/A:1009662704779.
Wikle, C. K., Milliff, R. F., Nychka, D., and Berliner, L. M. (2001), “Spatiotemporal Hierarchical Bayesian Modeling Tropical Ocean Surface Winds,” Journal of the American Statistical Association, 96, 382–397.
Zammit-Mangion, A., Rougier, J., Bamber, J., and Schön, N. (2014), “Resolving the Antarctic contribution to sea-level rise: a hierarchical modelling framework,” Environmetrics, 25, 245–264.
Appendices
A Derivation of the Exact Likelihood and Computational Simplifications
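As is shown in Appendix B of Gopalan et al. (2018), the covariance matrix for the observed data can be written as \(\text {U} \otimes \text {V} + \sigma ^2\text {I}\), where \(\text {U}_{ab} = k \min (a,b)\) with \(\text {U} \in \mathbb {R}^{N \times N}\), and \(\text {V} = \text {A}(\Sigma )\text {A}^{\intercal }\). It can be verified that \(\text {U}^{-1}\) is tridiagonal, so it has bandwidth one; more specifically:
$$\begin{aligned} \text {U}^{-1} = \frac{1}{k} \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}. \end{aligned}$$
One useful property of the Kronecker product is that \((\text {U} \otimes \text {V})^{-1} = \text {U}^{-1} \otimes \text {V}^{-1}\). Therefore:
$$\begin{aligned} (\text {U} \otimes \text {V})^{-1} = \frac{1}{k} \begin{pmatrix} 2\text {V}^{-1} & -\text {V}^{-1} & & & \\ -\text {V}^{-1} & 2\text {V}^{-1} & -\text {V}^{-1} & & \\ & \ddots & \ddots & \ddots & \\ & & -\text {V}^{-1} & 2\text {V}^{-1} & -\text {V}^{-1} \\ & & & -\text {V}^{-1} & \text {V}^{-1} \end{pmatrix}, \end{aligned}$$
whose bandwidth is \(O(m)\), since each block \(\text {V}^{-1}\) is \(m \times m\).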
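The tridiagonal form of \(\text {U}^{-1}\) and the Kronecker inverse identity can be checked numerically. The following is a minimal sketch in Python with NumPy; the function names, the sizes `N`, `m`, `k`, and the random stand-in for \(\text {V}\) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def random_walk_cov(n_times, k):
    """U with entries U_ab = k * min(a, b) for a, b = 1..N."""
    idx = np.arange(1, n_times + 1)
    return k * np.minimum.outer(idx, idx)

def random_walk_precision(n_times, k):
    """Tridiagonal candidate for U^{-1}: 2 on the diagonal (1 in the last
    entry) and -1 on the first off-diagonals, all scaled by 1/k."""
    q = 2.0 * np.eye(n_times) - np.eye(n_times, k=1) - np.eye(n_times, k=-1)
    q[-1, -1] = 1.0
    return q / k

def bandwidth(mat, tol=0.0):
    """Largest |i - j| over entries with |mat[i, j]| > tol."""
    rows, cols = np.nonzero(np.abs(mat) > tol)
    return int(np.max(np.abs(rows - cols))) if rows.size else 0

# Illustrative sizes only (assumed, not from the paper).
N, m, k = 5, 3, 0.7
rng = np.random.default_rng(0)
A = rng.normal(size=(m, m))
V = A @ A.T + np.eye(m)  # stand-in for A(Sigma)A^T, kept well conditioned

U = random_walk_cov(N, k)
U_inv = random_walk_precision(N, k)
W_inv = np.kron(U_inv, np.linalg.inv(V))  # U^{-1} (x) V^{-1}

print("U @ U_inv is the identity:", np.allclose(U @ U_inv, np.eye(N)))
print("bandwidth of U_inv:", bandwidth(U_inv))
print("bandwidth of W_inv:", bandwidth(W_inv), "with m =", m)
```

Because the blocks of \(\text {U}^{-1} \otimes \text {V}^{-1}\) are dense \(m \times m\) matrices arranged in a block-tridiagonal pattern, the bandwidth is at most \(2m-1\), which is consistent with the \(O(m)\) statement above.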
Let us denote \(\text {U} \otimes \text {V}\) as \(\text {W}\). By the matrix inversion lemma, it follows that \((\sigma ^2\text {I}+\text {W})^{-1} = \sigma ^{-2}\text {I}-\sigma ^{-2}(\text {W}^{-1}+\sigma ^{-2}\text {I})^{-1}\text {I}\sigma ^{-2}\). The matrix \(\text {W}^{-1}+\sigma ^{-2}\text {I}\) has bandwidth \(O(m)\) since \(\text {W}^{-1}\) has bandwidth \(O(m)\) as shown previously, so this expression can be computed in \(O(Nm^3)\) operations (Rue 2001; Golub and Van Loan 2012).
Similarly, by the matrix determinant lemma, \(\log [\det (\sigma ^2\text {I}+\text {W})] = \log [\det (\text {I}+\sigma ^2\text {W}^{-1})\det (\text {W}^{-1})^{-1}] = \log [\det (\text {I}+\sigma ^2\text {W}^{-1})]-\log [\det (\text {W}^{-1})]\). Since both terms are log determinants of square matrices of dimension \(Nm\) and bandwidth \(O(m)\), this can be calculated in \(O(Nm^3)\) operations due to the efficient Cholesky factorization of band-limited matrices (Rue 2001; Golub and Van Loan 2012).
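The two identities can be illustrated with banded Cholesky routines. The sketch below uses Python with NumPy and SciPy (`scipy.linalg.solveh_banded` and `scipy.linalg.cholesky_banded`); it is not the implementation used in the paper. The helper names, the problem sizes, and the random stand-in for \(\text {V}\) are assumptions, and for brevity the banded storage is packed from a dense matrix, whereas a practical implementation would assemble the bands directly.

```python
import numpy as np
from scipy.linalg import cholesky_banded, solveh_banded

def to_lower_banded(mat, bw):
    """Pack a symmetric matrix of bandwidth bw into LAPACK lower banded
    storage, where ab[d, j] holds mat[j + d, j]."""
    n = mat.shape[0]
    ab = np.zeros((bw + 1, n))
    for d in range(bw + 1):
        ab[d, : n - d] = np.diag(mat, -d)
    return ab

def banded_logdet(ab):
    """Log determinant of a positive definite banded matrix via its
    banded Cholesky factor (row 0 of the factor is its diagonal)."""
    chol = cholesky_banded(ab, lower=True)
    return 2.0 * np.sum(np.log(chol[0]))

def solve_and_logdet(W_inv, bw, sigma2, y):
    """Return (sigma2*I + W)^{-1} y and log det(sigma2*I + W), touching
    only matrices that share the bandwidth of W^{-1}."""
    n = W_inv.shape[0]
    B = W_inv + np.eye(n) / sigma2   # W^{-1} + sigma^{-2} I
    C = np.eye(n) + sigma2 * W_inv   # I + sigma^2 W^{-1}
    # Matrix inversion lemma applied to the vector y.
    x = y / sigma2 - solveh_banded(to_lower_banded(B, bw), y, lower=True) / sigma2**2
    # Matrix determinant lemma.
    logdet = banded_logdet(to_lower_banded(C, bw)) - banded_logdet(to_lower_banded(W_inv, bw))
    return x, logdet

# Illustrative sizes only (assumed, not from the paper).
N, m, k, sigma2 = 4, 3, 0.5, 0.3
rng = np.random.default_rng(1)
A = rng.normal(size=(m, m))
V = A @ A.T + np.eye(m)              # stand-in for A(Sigma)A^T
idx = np.arange(1, N + 1)
U = k * np.minimum.outer(idx, idx)
U_inv = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / k
U_inv[-1, -1] = 1.0 / k
W_inv = np.kron(U_inv, np.linalg.inv(V))
y = rng.normal(size=N * m)

x, logdet = solve_and_logdet(W_inv, 2 * m - 1, sigma2, y)
```

For a matrix of dimension \(Nm\) and bandwidth \(b = O(m)\), a banded Cholesky factorization costs on the order of \(Nm \cdot b^2\) operations, which gives the \(O(Nm^3)\) figure quoted above.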
B First-Order Spatiotemporal Emulators
In the examples of this paper, the function \(\varvec{f}(\cdot ,\cdot ,\cdot )\) (i.e., the computer simulator) can take one of two forms: a numerical PDE solver for the shallow ice approximation (SIA), or an emulator constructed from the numerical PDE solver for the SIA. The numerical method for solving the SIA PDE is as given in Gopalan et al. (2018), and the emulator is constructed from the finite difference solver in the manner suggested in Hooten et al. (2011), termed first-order emulation.
That is, we start with a set of plausible values for ice viscosity, \(\{\theta _1,\theta _2,\ldots ,\theta _p\}\), and, for each time point \(ck\) at which data are collected, we store a matrix \(\text {M}_{ck}\), where the \(q\text {th}\) column of \(\text {M}_{ck}\) is the output of the numerical solver using parameter value \(\theta _q\) after running forward for \(ck\) time steps. Thus, each matrix \(\text {M}_{ck}\) is of dimension \(n\) by \(p\), and without essential loss of generality, we can assume that \(n\) is much larger than \(p\) and that each matrix \(\text {M}_{ck}\) is of rank \(p\).
For each matrix \(\text {M}_{ck}\), we compute a singular value decomposition (SVD), \(\text {U}_{ck}\text {D}_{ck}\text {V}^{\intercal }_{ck}\). The goal is to find a (vector-valued) function \(v_{ck}(\theta ^{*})\) such that the emulated output at time \(ck\) for parameter value \(\theta ^{*}\) is \(\text {U}_{ck}\text {D}_{ck}v_{ck}(\theta ^{*})\). To find the \(q\text {th}\) element of \(v_{ck}\), we train a random forest (Breiman 2001; Liaw and Wiener 2002) with \((\theta _1, (V^{\intercal }_{ck})_{q1}), (\theta _2, (V^{\intercal }_{ck})_{q2}),\ldots ,(\theta _p, (V^{\intercal }_{ck})_{qp})\) as training data, where \((V^{\intercal }_{ck})_{q1}\) is the first element of the \(q\text {th}\) right singular vector, \((V^{\intercal }_{ck})_{q2}\) is the second element of the \(q\text {th}\) right singular vector, and so on. Not all of the right singular vectors need be used in emulation, and a heuristic such as a scree (elbow) plot or the randomization procedure of Friedman et al. (2001) can be used to determine the number of right singular vectors to keep. However, if the number of simulator runs (\(p\)) is much smaller than the dimensionality of the output (\(n\)), all of the right singular vectors can be used while still achieving computational savings, as is done in the experiments of this paper.
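The construction above can be sketched for a single time point as follows. This is an illustrative Python/NumPy example under stated assumptions, not the paper's implementation: the simulator `toy_simulator` is a hypothetical smooth function standing in for the SIA solver, and linear interpolation over \(\theta \) (`np.interp`) is swapped in for the random forest so that the sketch depends only on NumPy. All right singular vectors are retained, as in the experiments described above.

```python
import numpy as np

def toy_simulator(theta, x):
    """Hypothetical stand-in for the numerical solver output at one time
    point: a smooth profile over the grid x that depends on theta."""
    return np.exp(-theta * x)

def fit_emulator(thetas, runs):
    """Build a first-order emulator from simulator runs.

    thetas : increasing array of p design values theta_1..theta_p
    runs   : n-by-p matrix M whose q-th column is the output for theta_q
    """
    U, d, Vt = np.linalg.svd(runs, full_matrices=False)  # M = U D V^T
    basis = U * d                                        # U D, n-by-p

    def emulate(theta_star):
        # The q-th element of v(theta*) is predicted from the pairs
        # (theta_j, (V^T)_{qj}); linear interpolation replaces the
        # random forest used in the paper.
        v = np.array([np.interp(theta_star, thetas, Vt[q])
                      for q in range(Vt.shape[0])])
        return basis @ v

    return emulate

# Illustrative design (assumed values): n = 50 grid points, p = 16 runs.
x = np.linspace(0.0, 1.0, 50)
thetas = np.linspace(0.5, 2.0, 16)
M = np.column_stack([toy_simulator(t, x) for t in thetas])
emulate = fit_emulator(thetas, M)

theta_new = 1.234
max_err = np.max(np.abs(emulate(theta_new) - toy_simulator(theta_new, x)))
print("max abs emulation error at theta_new:", max_err)
```

At a design value \(\theta _q\) the interpolated vector equals the \(q\text {th}\) column of \(\text {V}^{\intercal }\), so the emulator reproduces the stored simulator run; between design values the accuracy depends on the regression method and on how densely the parameter range is sampled.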
We have assumed the initial conditions and boundary conditions are known, since this is the case in the glaciology problems we have studied, where the boundary condition is that glacial thickness is nonnegative, and the initial glacier profile (i.e., a dome) is known. In general, however, \(\varvec{\phi }\) may be incorporated into the analysis above by considering \(\theta \) and \(\varvec{\phi }\) jointly. Additionally, a variant is to directly emulate the likelihood function. However, since there is flexibility in the choice of \(\Sigma \) (which enters into the likelihood), unless one is set on using a particular value of \(\Sigma \), it is sensible to emulate the numerical solver as opposed to retraining a likelihood emulator for each potential choice of \(\Sigma \).
Cite this article
Gopalan, G., Hrafnkelsson, B., Wikle, C.K. et al. A Hierarchical Spatiotemporal Statistical Model Motivated by Glaciology. JABES 24, 669–692 (2019). https://doi.org/10.1007/s13253-019-00367-1
DOI: https://doi.org/10.1007/s13253-019-00367-1