Abstract
In this paper, we analyze the effect of dimensionality on the supervised learning of infinitely differentiable regression functions. By invoking the Van Trees lower bound, we prove lower bounds on the generalization error in terms of the number of samples and the dimensionality of the input space, in both linear and non-linear settings. We show that in non-linear problems without prior knowledge, the curse of dimensionality is a serious obstacle. At the same time, we speculate, counter-intuitively, that supervised learning can sometimes become feasible in the asymptotic limit of infinite dimensionality.
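To give a concrete sense of the curse of dimensionality discussed above, the following sketch (illustrative only, not taken from the paper) uses the classical nonparametric minimax rate n^(-2s/(2s+d)) for s-smooth regression in d dimensions (Stone, 1982) to estimate how many samples the rate bound alone demands before the error drops below a target eps; the requirement grows exponentially in d.

```python
def samples_needed(eps: float, s: float, d: int) -> float:
    """Smallest n (as a real number) with n**(-2*s/(2*s + d)) <= eps,
    read off directly from the classical minimax rate."""
    # Solve n**(-2s/(2s+d)) = eps for n: n = eps**(-(2s+d)/(2s)).
    exponent = (2 * s + d) / (2 * s)
    return eps ** (-exponent)

if __name__ == "__main__":
    # Twice-differentiable target (s = 2), error target eps = 0.1:
    # the required sample size explodes as the input dimension grows.
    for d in (1, 5, 10, 20):
        print(f"d = {d:2d}: n >= {samples_needed(0.1, s=2.0, d=d):.3e}")
```

For infinitely differentiable functions, as considered in the paper, the rates are faster, but the qualitative dependence on dimension without prior knowledge remains the point of the analysis.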
Liitiäinen, E., Corona, F. & Lendasse, A. On the Curse of Dimensionality in Supervised Learning of Smooth Regression Functions. Neural Process Lett 34, 133–154 (2011). https://doi.org/10.1007/s11063-011-9188-7