On the Convergence Rate of Sparse Grid Least Squares Regression

  • Bastian Bohn
Conference paper
Part of the Lecture Notes in Computational Science and Engineering book series (LNCSE, volume 123)


While sparse grid least squares regression algorithms have been used frequently over the last 15 years to tackle Big Data problems with very large numbers of input data points, a thorough theoretical analysis of their stability properties, their error decay behavior, and appropriate couplings between the dataset size and the grid size has not yet been provided. In this paper, we present a framework that allows us to close this gap and to rigorously derive upper bounds on the expected error of sparse grid least squares regression. Furthermore, we verify that our theoretical convergence results match the rates observed in numerical experiments.



The author was supported by the Sonderforschungsbereich 1060 "The Mathematics of Emergent Effects", funded by the Deutsche Forschungsgemeinschaft.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Institute for Numerical Simulation, University of Bonn, Bonn, Germany
