Scalable Kernel Methods for Uncertainty Quantification

Part of the book series: Lecture Notes in Computational Science and Engineering (LNCSE, volume 105)

Abstract

Kernel methods are a broad class of algorithms that find application in approximation theory and non-parametric statistics. In this article, we review the literature with a focus on methods for uncertainty quantification, and we discuss computational challenges related to kernel methods. In particular, we focus on approximating kernel matrices, one of the main computational bottlenecks in kernel methods. The most popular method for constructing approximations of kernel matrices is the Nyström method, which uses randomized sampling to construct a low-rank factorization of a kernel matrix. We present a parallel implementation of the Nyström method using the Elemental parallel linear algebra library and discuss an efficient variant called the one-shot Nyström method. We conclude with examples of regression problems for binary classification in high dimensions that illustrate the capabilities and limitations of Nyström methods. In our largest test, we consider a dataset from high-energy physics in 28 dimensions with ten million points.
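
As a concrete illustration of the column-sampling idea mentioned in the abstract, the sketch below builds a standard Nyström approximation of a Gaussian kernel matrix in NumPy. It is a minimal serial example under assumed choices (Gaussian kernel, uniform column sampling, pseudoinverse of the sampled block) and does not reproduce the chapter's Elemental-based parallel implementation or the one-shot variant.

    # Minimal sketch of the standard Nystrom approximation (not the chapter's
    # parallel Elemental implementation). Assumptions: Gaussian kernel,
    # uniform column sampling, pseudoinverse of the sampled block.
    import numpy as np

    def gaussian_kernel(X, Y, bandwidth=1.0):
        # Gaussian (RBF) kernel matrix between the rows of X and the rows of Y.
        sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
        return np.exp(-sq / (2.0 * bandwidth**2))

    def nystrom_factors(X, m, bandwidth=1.0, seed=0):
        # Return (C, W_pinv) so that K is approximated by C @ W_pinv @ C.T,
        # where C = K[:, S] and W = K[S, S] for m uniformly sampled columns S.
        rng = np.random.default_rng(seed)
        S = rng.choice(X.shape[0], size=m, replace=False)
        C = gaussian_kernel(X, X[S], bandwidth)       # n-by-m block of K
        W = gaussian_kernel(X[S], X[S], bandwidth)    # m-by-m block of K
        return C, np.linalg.pinv(W)

    # Usage: approximate the kernel matrix of synthetic data and check the error.
    X = np.random.default_rng(1).standard_normal((1000, 10))
    C, W_pinv = nystrom_factors(X, m=100, bandwidth=2.0)
    K_approx = C @ W_pinv @ C.T
    K_exact = gaussian_kernel(X, X, bandwidth=2.0)
    print(np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact))

The approximation only forms the n-by-m and m-by-m blocks of the kernel matrix, which is what makes the method attractive when n is large and m can be kept small.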

Notes

  1. In mathematical physics, the kernel is the Green's function of the partial differential equations (PDEs) that model the target application, and the weights are the right-hand side of the PDE.

  2. Throughout, we refer to a point \(\underline{x}_{i}\) for which we compute \(y_{i}\) as a target, and to a point \(\underline{x}_{j}\) with weight \(w_{j}\) as a source; the summation this defines is written out after these notes.

  3. ASKIT stands for Approximate Skeletonization Kernel-Independent Treecode.

  4. For example, the intrinsic dimension of a set of points distributed on a curve in three dimensions is one.

  5. We use the term interaction between two points \(\underline{x}_{i}\) and \(\underline{x}_{j}\) to refer to \(K(\underline{x}_{i},\underline{x}_{j})\).
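
For concreteness, the kernel summation that Notes 1, 2, and 5 refer to can be written in the standard form below (notation as in the notes; the symbols M and N for the numbers of targets and sources are assumptions of this illustration):

\[
y_{i} \;=\; \sum_{j=1}^{N} K(\underline{x}_{i}, \underline{x}_{j})\, w_{j}, \qquad i = 1, \ldots, M.
\]

Each target value \(y_{i}\) accumulates the interactions \(K(\underline{x}_{i},\underline{x}_{j})\) of that target with every source, weighted by \(w_{j}\).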

Acknowledgements

This material is based upon work supported by AFOSR grants FA9550-12-10484 and FA9550-11-10339; by NSF grants CCF-1337393 and OCI-1029022; by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Numbers DE-SC0010518, DE-SC0009286, and DE-FG02-08ER2585; by NIH grant 10042242; and by the Technische Universität München—Institute for Advanced Study, funded by the German Excellence Initiative (and the European Union Seventh Framework Programme under grant agreement 291763). Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily reflect the views of the AFOSR or the NSF. Computing time on the Texas Advanced Computing Center's Stampede system was provided by an allocation from TACC and the NSF.

Author information

Correspondence to G. Biros.

Copyright information

© 2015 Springer International Publishing Switzerland

Cite this chapter

Tharakan, S., March, W.B., Biros, G. (2015). Scalable Kernel Methods for Uncertainty Quantification. In: Mehl, M., Bischoff, M., Schäfer, M. (eds) Recent Trends in Computational Engineering - CE2014. Lecture Notes in Computational Science and Engineering, vol 105. Springer, Cham. https://doi.org/10.1007/978-3-319-22997-3_1