Uncertainty Quantification Using the Nearest Neighbor Gaussian Process

Shi, Hongxiang; Kang, Emily L.; Konomi, Bledar A.; Vemaganti, Kumar; Madireddy, Sandeep

doi:10.1007/978-3-319-69416-0_6

Part of the book series: ICSA Book Series in Statistics (ICSABSS)

Abstract

The Gaussian process has been widely used in areas including geostatistics and uncertainty quantification because it gives a parsimonious yet flexible representation of a stochastic process. However, analyzing a large data set with a Gaussian process can be challenging due to its O(n³) computational complexity, where n denotes the size of the data set. The recently proposed Nearest Neighbor Gaussian Process (NNGP) approximates a Gaussian process with a target covariance function by using a series of conditional distributions and then exploiting the resulting sparse precision matrices. We demonstrate that NNGP has the potential to be used for uncertainty quantification. We find that when NNGP is used to approximate a Gaussian process with strong smoothness, e.g., one with the squared-exponential covariance function, Bayesian inference needs to be carried out carefully, by marginalizing over the random effects in NNGP. Using simulated and real data, we investigate empirically how well NNGP approximates the squared-exponential covariance function, as well as its ability to handle the change-of-support effect, a common phenomenon in geostatistics and uncertainty quantification when only data aggregated over space are available.
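
The construction summarized in the abstract (a series of conditional distributions that yields a sparse precision matrix) can be illustrated with a minimal sketch. This is not the chapter's implementation: the function names `sq_exp_cov` and `nngp_precision`, the 1-D ordered locations, the parameter values, and the small `jitter` added to stabilize the squared-exponential covariance are all assumptions made for illustration. Each location is conditioned on at most m nearest neighbours among the locations that precede it in the ordering, and the precision matrix is assembled as (I - A)^T F^{-1} (I - A), where row i of A holds the conditional weights and F is the diagonal matrix of conditional variances.

```python
import numpy as np

def sq_exp_cov(a, b, sigma2=1.0, length=0.05):
    """Squared-exponential covariance between 1-D location arrays a and b."""
    d = a[:, None] - b[None, :]
    return sigma2 * np.exp(-0.5 * (d / length) ** 2)

def nngp_precision(locs, m, sigma2=1.0, length=0.05, jitter=1e-6):
    """Return the NNGP precision matrix (I - A)^T F^{-1} (I - A) as a dense array.

    locs must already be in the chosen ordering; location i is conditioned on
    its (at most) m nearest neighbours among locations 0..i-1.
    """
    n = len(locs)
    A = np.zeros((n, n))          # strictly lower triangular weight matrix
    f = np.empty(n)               # conditional variances (diagonal of F)
    for i in range(n):
        c_ii = sigma2 + jitter    # marginal variance plus the assumed jitter
        if i == 0:
            f[i] = c_ii           # first location has nothing to condition on
            continue
        prev = np.arange(i)
        # indices of the nearest earlier locations (hypothetical neighbour rule)
        nb = prev[np.argsort(np.abs(locs[prev] - locs[i]))[:m]]
        C_nn = sq_exp_cov(locs[nb], locs[nb], sigma2, length) + jitter * np.eye(len(nb))
        c_ni = sq_exp_cov(locs[nb], locs[i:i + 1], sigma2, length)[:, 0]
        b = np.linalg.solve(C_nn, c_ni)   # weights of w_i on its neighbours
        A[i, nb] = b
        f[i] = c_ii - c_ni @ b            # conditional variance of w_i
    IA = np.eye(n) - A
    return IA.T @ (IA / f[:, None])       # rows of (I - A) scaled by 1 / f_i

locs = np.linspace(0.0, 1.0, 30)                 # ordered 1-D locations
Q_sparse = nngp_precision(locs, m=3)             # few neighbours: banded precision
Q_full = nngp_precision(locs, m=len(locs) - 1)   # condition on all predecessors
C = sq_exp_cov(locs, locs) + 1e-6 * np.eye(len(locs))
```

When every location conditions on all of its predecessors (m = n - 1), the product of conditionals is the exact joint density, so `Q_full` is the inverse of `C` up to rounding. With a small m most entries of the precision matrix are exactly zero, which is where the computational saving comes from. The dense arrays here are only for readability; a practical version would store A and the precision matrix in a sparse format.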

References

Arendt, P. D., Apley, D. W., & Chen, W. (2012). Quantification of model uncertainty: Calibration, model discrepancy, and identifiability. Journal of Mechanical Design, 134, 100908-100908-12.
Banerjee, S., Carlin, B. P., & Gelfand, A. E. (2014). Hierarchical modeling and analysis for spatial data. Boca Raton: CRC Press.
Banerjee, S., Gelfand, A. E., Finley, A. O., & Sang, H. (2008). Gaussian predictive process models for large spatial data sets. Journal of the Royal Statistical Society B, 70, 825–848.
Berrocal, V. J., Gelfand, A. E., & Holland, D. M. (2010). A spatio-temporal downscaler for output from numerical models. Journal of Agricultural, Biological, and Environmental Statistics, 15, 176–197.
Bush, A., Gibson, R., & Thomas, T. (1975). The elastic contact of a rough surface. Wear, 35, 87–111.
Craig, P. S., Goldstein, M., Rougier, J. C., & Seheult, A. H. (2001). Bayesian forecasting for complex systems using computer simulators. Journal of the American Statistical Association, 96, 717–729.
Cressie, N. (1993). Statistics for spatial data, revised ed. New York: Wiley.
Cressie, N. (1996). Change of support and the modifiable areal unit problem. Geographical Systems, 3, 159–180.
Cressie, N., & Johannesson, G. (2008). Fixed rank kriging for very large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70, 209–226.
Cressie, N., Shi, T., & Kang, E. L. (2010). Fixed rank filtering for spatio-temporal data. Journal of Computational and Graphical Statistics, 19, 724–745.
Crevillen-Garcia, D., Wilkinson, R. D., Shah, A. A., & Power, H. (2017). Gaussian process modelling for uncertainty quantification in convectively-enhanced dissolution processes in porous media. Advances in Water Resources, 99, 1–14.
Currin, C., Mitchell, T., Morris, M., & Ylvisaker, D. (1988). A Bayesian approach to the design and analysis of computer experiments. Technical Report, ORNL498, Oak Ridge Laboratory.
Datta, A., Banerjee, S., Finley, A. O., & Gelfand, A. E. (2016). Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. Journal of the American Statistical Association, 111, 800–812.
Emery, X. (2009). The kriging update equations and their application to the selection of neighboring data. Computational Geosciences, 13, 269–280.
Furrer, R., Genton, M. G., & Nychka, D. (2006). Covariance tapering for interpolation of large spatial datasets. Journal of Computational and Graphical Statistics, 15, 502–523.
Gneiting, T., Kleiber, W., & Schlather, M. (2010). Matérn cross-covariance functions for multivariate random fields. Journal of the American Statistical Association, 105, 1167–1177.
Goulard, M., & Voltz, M. (1992). Linear coregionalization model: Tools for estimation and choice of cross-variogram matrix. Mathematical Geology, 24, 269–286.
Gramacy, R. B., & Apley, D. W. (2015). Local Gaussian process approximation for large computer experiments. Journal of Computational and Graphical Statistics, 24, 561–578.
Gramacy, R. B., & Lee, H. K. H. (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association, 103, 1119–1130.
Greenwood, J. A., & Williamson, J. B. P. (1966). Contact of nominally flat surfaces. In Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences. The Royal Society (Vol. 295, pp. 300–319).
Guttorp, P., & Gneiting, T. (2006). Studies in the history of probability and statistics XLIX: On the Matérn correlation family. Biometrika, 93, 989–995.
Higdon, D., Nakhleh, C., Gattiker, J., & Williams, B. (2008). A Bayesian calibration approach to the thermal problem. Computer Methods in Applied Mechanics and Engineering, 197, 2431–2441.
Kaufman, C. G., & Shaby, B. A. (2013). The role of the range parameter for estimation and prediction in geostatistics. Biometrika, 100, 473–484.
Kennedy, M. C., & O’Hagan, A. (2000). Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87, 1–13.
Kennedy, M. C., & O’Hagan, A. (2001). Bayesian calibration of computer models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63, 425–464.
Konomi, B., Sang, H., & Mallick, B. (2014). Adaptive Bayesian nonstationary modeling for large spatial datasets using covariance approximations. Journal of Computational and Graphical Statistics, 23, 802–829.
Liu, F., Bayarri, M. J., & Berger, J. O. (2009). Modularization in Bayesian analysis, with emphasis on analysis of computer models. Bayesian Analysis, 4, 119–150.
Nguyen, H., Cressie, N., & Braverman, A. (2012). Spatial statistical data fusion for remote sensing applications. Journal of the American Statistical Association, 107, 1004–1018.
Ohio Supercomputer Center (OSC). (1987). Columbus OH: Ohio Supercomputer Center. http://osc.edu/ark:/19495/f5s1ph73
Peng, C. Y., & Wu, J. (2014). On the choice of nugget in kriging modeling for deterministic computer experiments. Journal of Computational and Graphical Statistics, 23, 151–168.
Perdikaris, P., Venturi, D., Royset, J. O., & Karniadakis, G. E. (2015). Multi-fidelity modelling via recursive co-kriging and Gaussian Markov random fields. Proceedings of the Royal Society of London A, 471, 20150018.
Qian, P. Z. G., Wu, H., & Wu, C. F. J. (2008). Gaussian process models for computer experiments with qualitative and quantitative factors. Technometrics, 50, 383–396.
Rue, H., & Held, L. (2005). Gaussian Markov random fields: Theory and applications. Boca Raton: Chapman and Hall.
Sacks, J., Welch, W. J., Mitchell, T. J., & Wynn, H. P. (1989). Design and analysis of computer experiments. Statistical Science, 4, 409–423.
Santner, T. J., Williams, B. J., & Notz, W. I. (2013). The design and analysis of computer experiments. New York: Springer Science & Business Media.
Sista, B., & Vemaganti, K. (2014). Estimation of statistical parameters of rough surfaces suitable for developing micro-asperity friction models. Wear, 316, 6–18.
Stein, M. L. (1999). Interpolation of spatial data: Some theory for kriging. New York: Springer.
Tworzydlo, W. W., Cecot, W., Oden, J. T., & Yew, C. H. (1998). Computational micro- and macroscopic models of contact and friction: Formulation, approach and applications. Wear, 220, 113–140.
Wackernagel, H. (2003). Multivariate geostatistics: An introduction with applications, 3rd ed. Berlin: Springer.
Zaytsev, V., Biver, P., Wackernagel, H., & Allard, D. (2016). Change-of-support models on irregular grids for geostatistical simulation. Mathematical Geosciences, 48, 353–369.
Zhou, Q., Qian, P. Z. G., & Zhou, S. (2011). A simple approach to emulation for computer models with qualitative and quantitative factors. Technometrics, 53, 266–273.

Acknowledgements

This work was supported in part by an allocation of computing time from the Ohio Supercomputer Center (OSC 1987). Shi’s research was supported by the Taft Research Center at the University of Cincinnati. Kang’s research was partially supported by the Simons Foundation’s Collaboration Award (#317298) and the Taft Research Center at the University of Cincinnati. Vemaganti’s work was partially supported by the University of Cincinnati Simulation Center.

Author information

Authors and Affiliations

Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH, USA
Hongxiang Shi, Emily L. Kang & Bledar A. Konomi
Department of Mechanical and Materials Engineering, University of Cincinnati, Cincinnati, OH, USA
Kumar Vemaganti
Argonne National Laboratory, Lemont, IL, USA
Sandeep Madireddy

Corresponding author

Correspondence to Emily L. Kang.

Editor information

Editors and Affiliations

University of North Carolina, Chapel Hill, North Carolina, USA
Ding-Geng Chen
Columbia University, New York, New York, USA
Zhezhen Jin
University of California, Los Angeles, California, USA
Gang Li
University of Michigan-Ann Arbor, Ann Arbor, Michigan, USA
Yi Li
National Institutes of Health, Bethesda, Maryland, USA
Aiyi Liu
Georgia State University, Atlanta, Georgia, USA
Yichuan Zhao

About this chapter

Cite this chapter

Shi, H., Kang, E.L., Konomi, B.A., Vemaganti, K., Madireddy, S. (2017). Uncertainty Quantification Using the Nearest Neighbor Gaussian Process. In: Chen, DG., Jin, Z., Li, G., Li, Y., Liu, A., Zhao, Y. (eds) New Advances in Statistics and Data Science. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-69416-0_6

DOI: https://doi.org/10.1007/978-3-319-69416-0_6
Published: 18 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69415-3
Online ISBN: 978-3-319-69416-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
