Adaptive Metric Dimensionality Reduction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8139)

Abstract

We study data-adaptive dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are doubling, or nearly doubling, which yields a new theoretical explanation for empirically reported improvements gained by preprocessing Euclidean data by PCA (Principal Components Analysis) prior to constructing a linear classifier. On the algorithmic front, we describe an analogue of PCA for metric spaces, namely an efficient procedure that approximates the data’s intrinsic dimension, which is often much lower than the ambient dimension. Our approach thus leverages the dual benefits of low dimensionality: (1) more efficient algorithms, e.g., for proximity search, and (2) more optimistic generalization bounds.

A full version, including proofs omitted here, is available at [12].
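
The PCA connection in the abstract can be made concrete with a small experiment. The sketch below is not taken from the paper: it uses scikit-learn on synthetic data, and it chooses the target dimension by an explained-variance threshold, an illustrative stand-in for the intrinsic dimension that the paper's procedure estimates. It simply exercises the recipe the bound speaks to: project Euclidean data onto its leading principal components, then fit a linear classifier in the reduced space.

    # Illustrative sketch only (scikit-learn, synthetic data); not the paper's algorithm.
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Ambient dimension 200, but only 10 informative directions (low intrinsic dimension).
    X, y = make_classification(n_samples=2000, n_features=200, n_informative=10,
                               n_redundant=0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Data-adaptive choice of target dimension: keep enough principal components
    # to explain 95% of the variance (a simple proxy for the intrinsic dimension).
    pca = PCA(n_components=0.95).fit(X_train)
    Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)

    # Linear classifiers in the reduced space and in the ambient space.
    clf_low = LogisticRegression(max_iter=2000).fit(Z_train, y_train)
    clf_amb = LogisticRegression(max_iter=2000).fit(X_train, y_train)

    print("retained dimension:", pca.n_components_)
    print("accuracy after PCA:", clf_low.score(Z_test, y_test))
    print("accuracy in ambient space:", clf_amb.score(X_test, y_test))

On data whose informative structure sits in a low-dimensional subspace, the classifier trained after PCA typically matches or exceeds the ambient-space classifier while operating in far fewer dimensions, which is the qualitative behavior the generalization bound accounts for.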

References

  1. Andoni, A., Krauthgamer, R.: The computational hardness of estimating edit distance. SIAM J. Comput. 39(6), 2398–2429 (2010)

  2. Balcan, M.F., Blum, A., Vempala, S.: Kernels as features: On kernels, margins, and low-dimensional mappings. Mach. Learn. 65(1), 79–94 (2006)

  3. Bartlett, P.L., Mendelson, S.: Rademacher and Gaussian complexities: Risk bounds and structural results. JMLR 3, 463–482 (2002)

  4. Bi, J., Bennett, K.P., Embrechts, M.J., Breneman, C.M., Song, M.: Dimensionality reduction via sparse support vector machines. JMLR 3, 1229–1243 (2003)

  5. Blanchard, G., Zwald, L.: Finite-dimensional projection for classification and statistical learning. IEEE Trans. Inform. Theory 54(9), 4169–4182 (2008), http://dx.doi.org/10.1109/TIT.2008.926312

  6. Blum, A.: Random projection, margins, kernels, and feature-selection. In: Saunders, C., Grobelnik, M., Gunn, S., Shawe-Taylor, J. (eds.) SLSFS 2005. LNCS, vol. 3940, pp. 52–68. Springer, Heidelberg (2006)

  7. Burges, C.J.C.: Dimension reduction: A guided tour. Foundations and Trends in Machine Learning 2(4) (2010)

  8. Der, R., Lee, D.: Large-margin classification in Banach spaces. In: AISTATS 2007, pp. 91–98 (2007)

  9. Enflo, P.: On the nonexistence of uniform homeomorphisms between L_p-spaces. Ark. Mat. 8, 103–105 (1969)

  10. Fukumizu, K., Bach, F.R., Jordan, M.I.: Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces. JMLR 5, 73–99 (2004)

  11. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)

  12. Gottlieb, L.A., Kontorovich, A., Krauthgamer, R.: Adaptive metric dimensionality reduction (2013), http://arxiv.org/abs/1302.2752

  13. Gottlieb, L.A., Kontorovich, L., Krauthgamer, R.: Efficient classification for metric data. In: COLT, pp. 433–440 (2010)

  14. Gottlieb, L.A., Krauthgamer, R.: Proximity algorithms for nearly-doubling spaces. In: Serna, M., Shaltiel, R., Jansen, K., Rolim, J. (eds.) APPROX and RANDOM 2010. LNCS, vol. 6302, pp. 192–204. Springer, Heidelberg (2010)

  15. Gupta, A., Krauthgamer, R., Lee, J.R.: Bounded geometries, fractals, and low-distortion embeddings. In: FOCS, pp. 534–543 (2003)

  16. Hein, M., Bousquet, O., Schölkopf, B.: Maximal margin classification for metric spaces. J. Comput. Syst. Sci. 71(3), 333–359 (2005)

  17. Huang, K., Aviyente, S.: Large margin dimension reduction for sparse image classification. In: SSP, pp. 773–777 (2007)

  18. Koltchinskii, V., Panchenko, D.: Empirical margin distributions and bounding the generalization error of combined classifiers. Ann. Statist. 30(1), 1–50 (2002)

  19. Kpotufe, S., Dasgupta, S.: A tree-based regressor that adapts to intrinsic dimension. J. Comput. Syst. Sci. 78(5), 1496–1515 (2012), http://dx.doi.org/10.1016/j.jcss.2012.01.002

  20. Ledoux, M., Talagrand, M.: Probability in Banach Spaces. Springer (1991)

  21. Lee, J.A., Verleysen, M.: Nonlinear Dimensionality Reduction. Information Science and Statistics. Springer (2007)

  22. von Luxburg, U., Bousquet, O.: Distance-based classification with Lipschitz functions. JMLR 5, 669–695 (2004)

  23. Micchelli, C.A., Pontil, M.: A function representation for learning in Banach spaces. In: Shawe-Taylor, J., Singer, Y. (eds.) COLT 2004. LNCS (LNAI), vol. 3120, pp. 255–269. Springer, Heidelberg (2004)

  24. Mohri, M., Rostamizadeh, A., Talwalkar, A.: Foundations of Machine Learning. The MIT Press (2012)

  25. Naor, A., Schechtman, G.: Planar earthmover is not in L_1. SIAM J. Comput. 37, 804–826 (2007)

  26. Paul, S., Boutsidis, C., Magdon-Ismail, M., Drineas, P.: Random projections for support vector machines. CoRR abs/1211.6085 (2012)

  27. Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: NIPS (2007)

  28. Sabato, S., Srebro, N., Tishby, N.: Tight sample complexity of large-margin learning. In: NIPS, pp. 2038–2046 (2010)

  29. Schölkopf, B., Shawe-Taylor, J., Smola, A., Williamson, R.: Kernel-dependent support vector error bounds. In: ICANN (1999)

  30. Shawe-Taylor, J., Bartlett, P.L., Williamson, R.C., Anthony, M.: Structural risk minimization over data-dependent hierarchies. IEEE Trans. Inform. Theory 44(5), 1926–1940 (1998)

  31. Varshney, K.R., Willsky, A.S.: Linear dimensionality reduction for margin-based classification: High-dimensional data and sensor networks. IEEE Trans. Signal Process. 59(6), 2496–2512 (2011)

  32. Young, N.E.: Sequential and parallel algorithms for mixed packing and covering. In: FOCS, pp. 538–546 (2001)

  33. Zhang, H., Xu, Y., Zhang, J.: Reproducing kernel Banach spaces for machine learning. JMLR 10, 2741–2775 (2009)

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gottlieb, L.A., Kontorovich, A., Krauthgamer, R. (2013). Adaptive Metric Dimensionality Reduction. In: Jain, S., Munos, R., Stephan, F., Zeugmann, T. (eds.) Algorithmic Learning Theory. ALT 2013. Lecture Notes in Computer Science (LNAI), vol. 8139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40935-6_20

  • DOI: https://doi.org/10.1007/978-3-642-40935-6_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40934-9

  • Online ISBN: 978-3-642-40935-6

