
Efficient Regression in Metric Spaces via Approximate Lipschitz Extension

Conference paper in: Similarity-Based Pattern Recognition (SIMBAD 2013)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7953)


Abstract

We present a framework for performing efficient regression in general metric spaces. Roughly speaking, our regressor predicts the value at a new point by computing a Lipschitz extension — the smoothest function consistent with the observed data — while performing an optimized structural risk minimization to avoid overfitting. The offline (learning) and online (inference) stages can be solved by convex programming, but this naive approach has runtime complexity O(n³), which is prohibitive for large datasets. We instead design an algorithm that is fast when the doubling dimension, which measures the “intrinsic” dimensionality of the metric space, is low. We make dual use of the doubling dimension: first, on the statistical front, to bound the fat-shattering dimension of the class of Lipschitz functions (and thereby obtain risk bounds); and second, on the computational front, to quickly compute a hypothesis function and a prediction based on Lipschitz extension. The resulting regressor is asymptotically strongly consistent and comes with finite-sample risk bounds, while making minimal structural and noise assumptions.

A full version appears as arXiv:1111.4470 [GKK11].
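
For concreteness, the Lipschitz-extension prediction itself is simple to state. Below is a minimal sketch (in Python, with NumPy) of the classical McShane–Whitney midpoint extension [22, 32]: this is the naive O(n)-per-query baseline, not the paper's fast algorithm. The metric dist and the Lipschitz constant L are assumed given here; in the paper, L is selected by structural risk minimization.

```python
import numpy as np

def lipschitz_predict(X, y, L, dist, x_new):
    """Predict at x_new via the McShane-Whitney midpoint extension.

    F_plus(x)  = min_i (y_i + L * d(x, x_i))   (smallest L-Lipschitz majorant)
    F_minus(x) = max_i (y_i - L * d(x, x_i))   (largest  L-Lipschitz minorant)

    Their midpoint is L-Lipschitz and interpolates the observations
    whenever the data admits an L-Lipschitz interpolant.
    """
    d = np.array([dist(x_new, xi) for xi in X])
    f_plus = np.min(y + L * d)
    f_minus = np.max(y - L * d)
    return 0.5 * (f_plus + f_minus)

# Toy example on the real line with the absolute-value metric.
X = [0.0, 1.0, 2.0]
y = np.array([0.0, 1.0, 0.5])
print(lipschitz_predict(X, y, L=1.0, dist=lambda a, b: abs(a - b), x_new=0.5))
```

Each such query scans all n sample points; speeding up both this inference step and the hypothesis-selection step, with guarantees parameterized by the doubling dimension, is what the paper contributes.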


References

  1. Alon, N., Ben-David, S., Cesa-Bianchi, N., Haussler, D.: Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44(4), 615–631 (1997)

  2. Boucheron, S., Bousquet, O., Lugosi, G.: Theory of classification: A survey of recent advances. ESAIM Probab. Statist. 9, 323–375 (2005)

  3. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and regression trees. Wadsworth Statistics/Probability Series. Wadsworth Advanced Books and Software, Belmont (1984)

  4. Beygelzimer, A., Kakade, S., Langford, J.: Cover trees for nearest neighbor. In: 23rd International Conference on Machine Learning, pp. 97–104. ACM (2006)

  5. Cole, R., Gottlieb, L.-A.: Searching dynamic point sets in spaces with bounded doubling dimension. In: 38th Annual ACM Symposium on Theory of Computing, pp. 574–583 (2006)

  6. Clarkson, K.L.: Nearest neighbor queries in metric spaces. Discrete Comput. Geom. 22(1), 63–93 (1999)

  7. Clarkson, K.: Nearest-neighbor searching and metric space dimensions. In: Shakhnarovich, G., Darrell, T., Indyk, P. (eds.) Nearest-Neighbor Methods for Learning and Vision: Theory and Practice, pp. 15–59. MIT Press (2006)

  8. Devroye, L., Györfi, L., Lugosi, G.: A probabilistic theory of pattern recognition. Applications of Mathematics (New York), vol. 31. Springer, New York (1996)

  9. Gottlieb, L.-A., Krauthgamer, R.: Proximity algorithms for nearly-doubling spaces. In: Serna, M., Shaltiel, R., Jansen, K., Rolim, J. (eds.) APPROX 2010. LNCS, vol. 6302, pp. 192–204. Springer, Heidelberg (2010)

  10. Gottlieb, L.-A., Kontorovich, L., Krauthgamer, R.: Efficient classification for metric data. In: COLT, pp. 433–440 (2010)

  11. Gottlieb, L.-A., Kontorovich, A., Krauthgamer, R.: Efficient regression in metric spaces via approximate Lipschitz extension (2011), http://arxiv.org/abs/1111.4470

  12. Gottlieb, L.-A., Kontorovich, A., Krauthgamer, R.: Adaptive metric dimensionality reduction (2013), http://arxiv.org/abs/1302.2752

  13. Györfi, L., Kohler, M., Krzyżak, A., Walk, H.: A distribution-free theory of nonparametric regression. Springer Series in Statistics. Springer, New York (2002)

  14. Gupta, A., Krauthgamer, R., Lee, J.R.: Bounded geometries, fractals, and low-distortion embeddings. In: FOCS, pp. 534–543 (2003)

  15. Har-Peled, S., Mendel, M.: Fast construction of nets in low-dimensional metrics and their applications. SIAM Journal on Computing 35(5), 1148–1184 (2006)

  16. Kpotufe, S., Dasgupta, S.: A tree-based regressor that adapts to intrinsic dimension. Journal of Computer and System Sciences (2011) (to appear)

  17. Krauthgamer, R., Lee, J.R.: Navigating nets: Simple algorithms for proximity search. In: 15th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 791–801 (January 2004), http://dl.acm.org/citation.cfm?id=982792.982913

  18. Kpotufe, S.: Fast, smooth and adaptive regression in metric spaces. In: Bengio, Y., Schuurmans, D., Lafferty, J., Williams, C.K.I., Culotta, A. (eds.) Advances in Neural Information Processing Systems 22, pp. 1024–1032 (2009)

  19. Kleinberg, J., Slivkins, A., Wexler, T.: Triangulation and embedding using small sets of beacons. J. ACM 56, 32:1–32:37 (2009)

  20. Lafferty, J., Wasserman, L.: Rodeo: Sparse, greedy nonparametric regression. Ann. Stat. 36(1), 28–63 (2008)

  21. Lugosi, G., Zeger, K.: Nonparametric estimation via empirical risk minimization. IEEE Transactions on Information Theory 41(3), 677–687 (1995)

  22. McShane, E.J.: Extension of range of functions. Bull. Amer. Math. Soc. 40(12), 837–842 (1934)

  23. Minh, H.Q., Hofmann, T.: Learning over compact metric spaces. In: Shawe-Taylor, J., Singer, Y. (eds.) COLT 2004. LNCS (LNAI), vol. 3120, pp. 239–254. Springer, Heidelberg (2004)

  24. Nadaraya, È.A.: Nonparametric estimation of probability densities and regression curves. Mathematics and its Applications (Soviet Series), vol. 20. Kluwer Academic Publishers Group, Dordrecht (1989); translated from the Russian by Samuel Kotz

  25. Neylon, T.: Sparse solutions for linear prediction problems. PhD thesis, New York University (2006)

  26. Pollard, D.: Convergence of Stochastic Processes. Springer (1984)

  27. Shawe-Taylor, J., Bartlett, P.L., Williamson, R.C., Anthony, M.: Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory 44(5), 1926–1940 (1998)

  28. Tsybakov, A.B.: Introduction à l’estimation non-paramétrique. Mathématiques & Applications (Berlin), vol. 41. Springer, Berlin (2004)

  29. Vapnik, V.N.: The nature of statistical learning theory. Springer, New York (1995)

  30. von Luxburg, U., Bousquet, O.: Distance-based classification with Lipschitz functions. Journal of Machine Learning Research 5, 669–695 (2004)

  31. Wasserman, L.: All of nonparametric statistics. Springer Texts in Statistics. Springer, New York (2006)

  32. Whitney, H.: Analytic extensions of differentiable functions defined in closed sets. Transactions of the American Mathematical Society 36(1), 63–89 (1934)

  33. Young, N.E.: Sequential and parallel algorithms for mixed packing and covering. In: 42nd Annual IEEE Symposium on Foundations of Computer Science, pp. 538–546 (2001)




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gottlieb, L.-A., Kontorovich, A., Krauthgamer, R. (2013). Efficient Regression in Metric Spaces via Approximate Lipschitz Extension. In: Hancock, E., Pelillo, M. (eds) Similarity-Based Pattern Recognition. SIMBAD 2013. Lecture Notes in Computer Science, vol 7953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39140-8_3


  • DOI: https://doi.org/10.1007/978-3-642-39140-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39139-2

  • Online ISBN: 978-3-642-39140-8

