
Efficient Regression in Metric Spaces via Approximate Lipschitz Extension

Conference paper in: Similarity-Based Pattern Recognition (SIMBAD 2013)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 7953)


Abstract

We present a framework for performing efficient regression in general metric spaces. Roughly speaking, our regressor predicts the value at a new point by computing a Lipschitz extension — the smoothest function consistent with the observed data — while performing an optimized structural risk minimization to avoid overfitting. The offline (learning) and online (inference) stages can be solved by convex programming, but this naive approach has runtime complexity O(n³), which is prohibitive for large datasets. We instead design an algorithm that is fast when the doubling dimension, which measures the “intrinsic” dimensionality of the metric space, is low. We make dual use of the doubling dimension: first, on the statistical front, to bound the fat-shattering dimension of the class of Lipschitz functions (and thereby obtain risk bounds); and second, on the computational front, to quickly compute a hypothesis function and a prediction based on Lipschitz extension. The resulting regressor is asymptotically strongly consistent and comes with finite-sample risk bounds, while making minimal structural and noise assumptions.

A full version appears as arXiv:1111.4470 [GKK11].
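
For concreteness, the Lipschitz-extension prediction itself is simple to state. Below is a minimal sketch (in Python, with NumPy) of the classical McShane–Whitney midpoint extension [22, 32]: this is the naive O(n)-per-query baseline, not the paper's fast algorithm. The metric dist and the Lipschitz constant L are assumed given here; in the paper, L is selected by structural risk minimization.

```python
import numpy as np

def lipschitz_predict(X, y, L, dist, x_new):
    """Predict at x_new via the McShane-Whitney midpoint extension.

    F_plus(x)  = min_i (y_i + L * d(x, x_i))   (smallest L-Lipschitz majorant)
    F_minus(x) = max_i (y_i - L * d(x, x_i))   (largest  L-Lipschitz minorant)

    Their midpoint is L-Lipschitz and interpolates the observations
    whenever the data admits an L-Lipschitz interpolant.
    """
    d = np.array([dist(x_new, xi) for xi in X])
    f_plus = np.min(y + L * d)
    f_minus = np.max(y - L * d)
    return 0.5 * (f_plus + f_minus)

# Toy example on the real line with the absolute-value metric.
X = [0.0, 1.0, 2.0]
y = np.array([0.0, 1.0, 0.5])
print(lipschitz_predict(X, y, L=1.0, dist=lambda a, b: abs(a - b), x_new=0.5))
```

Each such query scans all n sample points; speeding up both this inference step and the hypothesis-selection step, with guarantees parameterized by the doubling dimension, is what the paper contributes.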


References

  1. Alon, N., Ben-David, S., Cesa-Bianchi, N., Haussler, D.: Scale-sensitive dimensions, uniform convergence, and learnability. Journal of the ACM 44(4), 615–631 (1997)

  2. Boucheron, S., Bousquet, O., Lugosi, G.: Theory of classification: A survey of recent advances. ESAIM Probab. Statist. 9, 323–375 (2005)

  3. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and regression trees. Wadsworth Statistics/Probability Series. Wadsworth Advanced Books and Software, Belmont (1984)

  4. Beygelzimer, A., Kakade, S., Langford, J.: Cover trees for nearest neighbor. In: 23rd International Conference on Machine Learning, pp. 97–104. ACM (2006)

  5. Cole, R., Gottlieb, L.-A.: Searching dynamic point sets in spaces with bounded doubling dimension. In: 38th Annual ACM Symposium on Theory of Computing, pp. 574–583 (2006)

  6. Clarkson, K.L.: Nearest neighbor queries in metric spaces. Discrete Comput. Geom. 22(1), 63–93 (1999)

  7. Clarkson, K.: Nearest-neighbor searching and metric space dimensions. In: Shakhnarovich, G., Darrell, T., Indyk, P. (eds.) Nearest-Neighbor Methods for Learning and Vision: Theory and Practice, pp. 15–59. MIT Press (2006)

  8. Devroye, L., Györfi, L., Lugosi, G.: A probabilistic theory of pattern recognition. Applications of Mathematics (New York), vol. 31. Springer, New York (1996)

  9. Gottlieb, L.-A., Krauthgamer, R.: Proximity algorithms for nearly-doubling spaces. In: Serna, M., Shaltiel, R., Jansen, K., Rolim, J. (eds.) APPROX 2010. LNCS, vol. 6302, pp. 192–204. Springer, Heidelberg (2010)

  10. Gottlieb, L.-A., Kontorovich, L., Krauthgamer, R.: Efficient classification for metric data. In: COLT, pp. 433–440 (2010)

  11. Gottlieb, L.-A., Kontorovich, A., Krauthgamer, R.: Efficient regression in metric spaces via approximate Lipschitz extension (2011), http://arxiv.org/abs/1111.4470

  12. Gottlieb, L.-A., Kontorovich, A., Krauthgamer, R.: Adaptive metric dimensionality reduction (2013), http://arxiv.org/abs/1302.2752

  13. Györfi, L., Kohler, M., Krzyżak, A., Walk, H.: A distribution-free theory of nonparametric regression. Springer Series in Statistics. Springer, New York (2002)

  14. Gupta, A., Krauthgamer, R., Lee, J.R.: Bounded geometries, fractals, and low-distortion embeddings. In: FOCS, pp. 534–543 (2003)

  15. Har-Peled, S., Mendel, M.: Fast construction of nets in low-dimensional metrics and their applications. SIAM Journal on Computing 35(5), 1148–1184 (2006)

  16. Kpotufe, S., Dasgupta, S.: A tree-based regressor that adapts to intrinsic dimension. Journal of Computer and System Sciences (2011) (to appear)

  17. Krauthgamer, R., Lee, J.R.: Navigating nets: Simple algorithms for proximity search. In: 15th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 791–801 (January 2004), http://dl.acm.org/citation.cfm?id=982792.982913

  18. Kpotufe, S.: Fast, smooth and adaptive regression in metric spaces. In: Bengio, Y., Schuurmans, D., Lafferty, J., Williams, C.K.I., Culotta, A. (eds.) Advances in Neural Information Processing Systems 22, pp. 1024–1032 (2009)

  19. Kleinberg, J., Slivkins, A., Wexler, T.: Triangulation and embedding using small sets of beacons. J. ACM 56, 32:1–32:37 (2009)

  20. Lafferty, J., Wasserman, L.: Rodeo: Sparse, greedy nonparametric regression. Ann. Stat. 36(1), 28–63 (2008)

  21. Lugosi, G., Zeger, K.: Nonparametric estimation via empirical risk minimization. IEEE Transactions on Information Theory 41(3), 677–687 (1995)

  22. McShane, E.J.: Extension of range of functions. Bull. Amer. Math. Soc. 40(12), 837–842 (1934)

  23. Minh, H.Q., Hofmann, T.: Learning over compact metric spaces. In: Shawe-Taylor, J., Singer, Y. (eds.) COLT 2004. LNCS (LNAI), vol. 3120, pp. 239–254. Springer, Heidelberg (2004)

  24. Nadaraya, È.A.: Nonparametric estimation of probability densities and regression curves. Mathematics and its Applications (Soviet Series), vol. 20. Kluwer Academic Publishers Group, Dordrecht (1989); translated from the Russian by Samuel Kotz

  25. Neylon, T.: Sparse solutions for linear prediction problems. PhD thesis, New York University (2006)

  26. Pollard, D.: Convergence of Stochastic Processes. Springer (1984)

  27. Shawe-Taylor, J., Bartlett, P.L., Williamson, R.C., Anthony, M.: Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory 44(5), 1926–1940 (1998)

  28. Tsybakov, A.B.: Introduction à l’estimation non-paramétrique. Mathématiques & Applications (Berlin), vol. 41. Springer, Berlin (2004)

  29. Vapnik, V.N.: The nature of statistical learning theory. Springer, New York (1995)

  30. von Luxburg, U., Bousquet, O.: Distance-based classification with Lipschitz functions. Journal of Machine Learning Research 5, 669–695 (2004)

  31. Wasserman, L.: All of nonparametric statistics. Springer Texts in Statistics. Springer, New York (2006)

  32. Whitney, H.: Analytic extensions of differentiable functions defined in closed sets. Transactions of the American Mathematical Society 36(1), 63–89 (1934)

  33. Young, N.E.: Sequential and parallel algorithms for mixed packing and covering. In: 42nd Annual IEEE Symposium on Foundations of Computer Science, pp. 538–546 (2001)




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gottlieb, L.-A., Kontorovich, A., Krauthgamer, R. (2013). Efficient Regression in Metric Spaces via Approximate Lipschitz Extension. In: Hancock, E., Pelillo, M. (eds) Similarity-Based Pattern Recognition. SIMBAD 2013. Lecture Notes in Computer Science, vol 7953. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39140-8_3


  • DOI: https://doi.org/10.1007/978-3-642-39140-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39139-2

  • Online ISBN: 978-3-642-39140-8

