A Nonlinear Approach to Dimension Reduction

Abstract

The \(\ell _2\) flattening lemma of Johnson and Lindenstrauss (in: Proceedings of the conference in modern analysis and probability, 1984) is a powerful tool for dimension reduction. It has been conjectured that the target dimension bounds can be refined and bounded in terms of the intrinsic dimensionality of the dataset (for example, the doubling dimension). One such problem was proposed by Lang and Plaut (Geom Dedicata 87(1–3):285–307, 2001) (see also Abraham et al. in: Proceedings of the 20th annual ACM–SIAM symposium on discrete algorithms, 2008; Chan et al. in: J ACM 57(4):1–26, 2010; Gupta et al. in: Proceedings of the 44th annual IEEE symposium on foundations of computer science, 2003; Matoušek in: Open problems on low-distortion embeddings of finite metric spaces, 2002), and is still open. We prove another result in this line of work:

The snowflake metric \(d^\alpha \) (\(\alpha <1\)) of a doubling set \(S\subset \ell _2\) embeds with constant distortion into \(\ell _2^D\) for dimension D that depends solely on the doubling constant of the metric.

In fact, the distortion can be made arbitrarily close to 1, and the target dimension is polylogarithmic in the doubling constant. Our techniques are robust and extend to the more difficult space \(\ell _1\), although the dimension bounds here are quantitatively inferior to those for \(\ell _2\).
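As a small illustration of the snowflake operation mentioned above, the following sketch checks numerically that raising Euclidean distances to a power \(\alpha <1\) still yields a metric on a random point set, while squaring them does not. The helper names (`snowflake`, `is_metric`) and the sample points are assumptions made for this example only; the check is a sanity test, not part of the paper's construction.

```python
import itertools
import math
import random


def snowflake(d: float, alpha: float) -> float:
    """Map a distance value d to d**alpha (a snowflake when 0 < alpha < 1)."""
    return d ** alpha


def is_metric(points, dist, tol=1e-12) -> bool:
    """Check the triangle inequality on every ordered triple of distinct points."""
    return all(
        dist(x, z) <= dist(x, y) + dist(y, z) + tol
        for x, y, z in itertools.permutations(points, 3)
    )


random.seed(0)
# Hypothetical sample: 12 random points in the unit cube of R^3.
pts = [(random.random(), random.random(), random.random()) for _ in range(12)]


def half_snowflake(x, y):
    """Snowflake distance with alpha = 1/2."""
    return snowflake(math.dist(x, y), 0.5)


def squared(x, y):
    """Squared Euclidean distance (exponent 2, not a snowflake)."""
    return math.dist(x, y) ** 2


snowflake_ok = is_metric(pts, half_snowflake)
# Three collinear points show that an exponent above 1 can break the triangle inequality.
line = [(0.0,), (1.0,), (2.0,)]
squared_ok = is_metric(line, squared)
print(snowflake_ok, squared_ok)
```

The second check fails because the squared distance between the endpoints is 4, which exceeds the sum 1 + 1 of the two intermediate squared distances.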

Notes

  1. Subsequent to the publication of this result in Proceedings of SODA 2011, Bartal and Gottlieb [6] presented a new single-scale embedding for all \(\ell _p\), \(1 \le p < 2\), and derived a snowflake embedding for \(\ell _p\) with only polynomial dependence on the doubling dimension.

  2. We suspect that the dimension can be further reduced, since the construction of Theorem 4.3 is an isometry on \(g_C(C\cap N)\) and does not exploit the \(1+\varepsilon \) distortion allowed by requirement (ii). However, an improved map \(\Psi \) cannot be linear, since in the worst case such a linear map requires dimension \(k = 2^{\Omega (|C\cap N|)}\) [10, Corollary 12.A].

References

  1. Abraham, I., Bartal, Y., Neiman, O.: Embedding metric spaces in their intrinsic dimension. In: Proceedings of the 19th Annual ACM–SIAM Symposium on Discrete Algorithms, pp. 363–372. SIAM, Philadelphia (2008)

  2. Abraham, I., Bartal, Y., Neiman, O.: On low dimensional local embeddings. In: Proceedings of the 20th Annual ACM–SIAM Symposium on Discrete Algorithms, pp. 875–884. SIAM, Philadelphia (2009)

  3. Alon, N.: Problems and results in extremal combinatorics I. Discrete Math. 273(1–3), 31–53 (2003)

  4. Assouad, P.: Plongements lipschitziens dans \({ R}^{n}\). Bull. Soc. Math. France 111(4), 429–448 (1983)

  5. Ball, K.: Isometric embedding in \(l_p\)-spaces. Eur. J. Comb. 11(4), 305–311 (1990)

  6. Bartal, Y., Gottlieb, L.: Dimension reduction techniques for \(l_p\,(1 \le p <\infty )\), with applications (2014). Available at arXiv:1408.1789

  7. Bartal, Y., Recht, B., Schulman, L.: Dimensionality reduction: beyond the Johnson–Lindenstrauss bound. In: Proceedings of the 22nd Annual ACM–SIAM Symposium on Discrete Algorithms, pp. 868–887. SIAM, Philadelphia (2011). An earlier version was available from the authors’ webpage in 2007 under the title “A Nash-type Dimensionality Reduction for Discrete Subsets of \(L_2\)”

  8. Chan, H., Gupta, A., Talwar, K.: Ultra-low-dimensional embeddings for doubling metrics. J. ACM 57(4), 1–26 (2010)

  9. Deza, M.M., Laurent, M.: Geometry of Cuts and Metrics. Springer, Berlin (1997)

  10. Figiel, T., Johnson, W.B., Schechtman, G.: Factorizations of natural embeddings of \(l^n_p\) into \(l_r\). II. Pac. J. Math. 150(2), 261–277 (1991)

  11. Gupta, A., Krauthgamer, R., Lee, J.R.: Bounded geometries, fractals, and low-distortion embeddings. In: Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, pp. 534–543. IEEE, Washington (October 2003)

  12. Har-Peled, S., Mendel, M.: Fast construction of nets in low-dimensional metrics and their applications. SIAM J. Comput. 35(5), 1148–1184 (2006)

  13. Indyk, P., Naor, A.: Nearest-neighbor-preserving embeddings. ACM Trans. Algorithms 3(3), 31 (2007)

  14. Johnson, W.B., Lindenstrauss, J.: Extensions of Lipschitz mappings into a Hilbert space. In: Proceedings of the Conference in Modern Analysis and Probability (New Haven, CT, 1982), pp. 189–206. American Mathematical Society, Providence, RI (1984)

  15. Kahane, J.-P.: Hélices et quasi-hélices. In Mathematical Analysis and Applications, Part B, Advances in Mathematics: Supplementary Studies, vol. 7, pp. 417–433. Academic Press, New York (1981)

  16. Kirszbraun, M.D.: Über die zusammenziehenden und lipschitzchen transformationen. Fund. Math. 22(134), 77–108 (1934)

  17. Krauthgamer, R., Lee, J.R.: Navigating nets: simple algorithms for proximity search. In: Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 791–801. SIAM, Philadelphia (January 2004)

  18. Lang, U., Plaut, C.: Bilipschitz embeddings of metric spaces into space forms. Geom. Dedicata 87(1–3), 285–307 (2001)

  19. Lee, J.R., Mendel, M., Naor, A.: Metric structures in \(L_1\): dimension, snowflakes, and average distortion. Eur. J. Comb. 26(8), 1180–1190 (2005)

  20. Matoušek, J.: On the distortion required for embedding finite metric spaces into normed spaces. Israel J. Math. 93, 333–344 (1996)

  21. Matoušek, J.: Open problems on low-distortion embeddings of finite metric spaces 2002. Available at http://kam.mff.cuni.cz/~matousek/metrop.ps (2007). Accessed 29 May 2015

  22. Ng, T.S.E., Zhang, H.: Predicting internet network distance with coordinates-based approaches. In: Proceedings of the INFOCOM, vol. 1, pp. 170–179. IEEE, Washington (2002)

  23. Roweis, S., Saul, L.: Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500), 2323–2326 (2000)

  24. Schechtman, G.: More on embedding subspaces of \(L_p\) in \(l^n_r\). Compositio Math. 61(2), 159–169 (1987)

  25. Schoenberg, I.J.: Metric spaces and completely monotone functions. Ann. Math. 39(4), 811–841 (1938)

  26. Schoenberg, I.J.: Metric spaces and positive definite functions. Trans. Am. Math. Soc. 44(3), 522–536 (1938)

  27. Schulman, L.J.: Clustering for edge-cost minimization (extended abstract). In: Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pp. 547–555. ACM, New York (2000). Full version available as ECCC report TR99-035

  28. Talagrand, M.: Embedding subspaces of \(L_1\) into \(l^N_1\). Proc. Am. Math. Soc. 108(2), 363–369 (1990)

  29. Talagrand, M.: Approximating a helix in finitely many dimensions. Ann. Inst. H. Poincaré Probab. Statist. 28(3), 355–363 (1992)

  30. Talagrand, M.: Embedding subspaces of \(L_p\) in \(l^N_p\). In Geometric Aspects of Functional Analysis (Israel, 1992–1994), Operator Theory: Advances and Applications, vol. 77, pp. 311–325. Birkhäuser, Basel (1995)

  31. Tenenbaum, J.B., de Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science 290(5500), 2319–2323 (2000)

Acknowledgments

The authors thank Assaf Naor and Gideon Schechtman for useful discussions and references, and Yair Bartal for helpful comments on an earlier version of this paper. This work was supported in part by The Israel Science Foundation (Grant #452/08), and by a Minerva grant.

Author information

Correspondence to Lee-Ad Gottlieb.

Additional information

A preliminary version of this paper appeared in Proceedings of SODA 2011. The current version includes some previously omitted proof material and various minor corrections.

Editor in Charge: Herbert Edelsbrunner

Appendix: Omitted Proof

Proof of Lemma 3.2

For assertion (i), since \(\mathrm{{e}}^{-t^2/r^2}\) is decreasing in \(t\ge 0\), \(G_r(t) = r(1-\mathrm{{e}}^{-t^2/r^2})^{1/2}\) is increasing in t. Now consider the function \(F(x) = \frac{1-\mathrm{{e}}^{-x}}{x}\). This function is monotonically decreasing in \(x>0\): the derivative of F(x) is \(\frac{(x+1)\mathrm{{e}}^{-x} - 1}{x^2}\), and the numerator is negative whenever \(x>0\) because \(\mathrm{{e}}^{x} > 1+x\). Since \(t^2/r^2\) decreases as r grows, it follows that \(G_r(t) = t \sqrt{F(t^2/r^2)}\) is monotonically increasing in r, completing assertion (i). Furthermore,

$$\begin{aligned} \frac{G_r(t)}{t} = \sqrt{F(t^2/r^2)} \end{aligned}$$

is monotonically decreasing in t, which proves assertion (ii).
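Assertions (i) and (ii) can be spot-checked numerically. The sketch below evaluates \(G_r(t)\) on a finite grid and tests the three monotonicity claims. The grid, the tolerance, and the helper names (`G`, `nondecreasing`) are assumptions chosen for illustration; a passing check is evidence on the sampled points only, not a proof.

```python
import math


def G(r: float, t: float) -> float:
    """G_r(t) = r * sqrt(1 - exp(-t^2 / r^2)), as defined in the proof."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small x.
    return r * math.sqrt(-math.expm1(-(t * t) / (r * r)))


def nondecreasing(values, tol=1e-12) -> bool:
    """True if each value is at least its predecessor, up to a rounding tolerance."""
    return all(b >= a - tol for a, b in zip(values, values[1:]))


grid = [0.05 * k for k in range(1, 201)]  # sample points in (0, 10]
fixed = (0.5, 1.0, 3.0)                   # values held fixed in each test

# (i) G_r(t) is increasing in t for fixed r, and increasing in r for fixed t.
inc_in_t = all(nondecreasing([G(r, t) for t in grid]) for r in fixed)
inc_in_r = all(nondecreasing([G(r, t) for r in grid]) for t in fixed)

# (ii) G_r(t) / t is decreasing in t, i.e. its negation is nondecreasing.
ratio_dec = all(nondecreasing([-G(r, t) / t for t in grid]) for r in fixed)

print(inc_in_t, inc_in_r, ratio_dec)
```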

For assertion (iii), write \(G(s) = (1-\mathrm{{e}}^{-s^2})^{1/2}\), so that \(G_r(t) = r\,G(t/r)\). Recall from assertion (i) that \(G_r(t)\) is monotonically increasing (in t), and thus, since \(t'\le (1+\eta )t\),

$$\begin{aligned} \frac{G_r(t')}{G_r(t)} \le \frac{G_r((1+\eta )t)}{G_r(t)} = \frac{G((1+\eta )t/r)}{G(t/r)}. \end{aligned}$$

Letting \(s=t/r\), we have

$$\begin{aligned} \frac{G((1+\eta )s)^2}{G(s)^2} - 1&= \frac{G((1+\eta )s)^2-G(s)^2}{G(s)^2} \\&= \frac{\mathrm{{e}}^{-s^2} - \mathrm{{e}}^{-(1+\eta )^2 s^2}}{1-\mathrm{{e}}^{-s^2}} \le \frac{\mathrm{{e}}^{-s^2}(1-\mathrm{{e}}^{-3\eta s^2})}{1-\mathrm{{e}}^{-s^2}}. \end{aligned} \tag{6.1}$$

Recall that by the Taylor series expansion, \(\mathrm{{e}}^{-z} = 1 - z + \frac{z^2}{2} - \frac{z^3}{6} + \ldots \), and so for all \(0\le z\le 1\) we have \( 1-z \le \mathrm{{e}}^{-z} \le 1-z+z^2/2 \le 1-z/2\). Using this estimate, we now have three cases:
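The chain of Taylor estimates can be verified on a grid. This is a minimal sketch, assuming a uniform grid of 1001 points on \([0,1]\) and a small rounding tolerance; the function name `taylor_chain_holds` is introduced here for illustration. It also shows that the last inequality of the chain stops holding once z leaves \([0,1]\).

```python
import math


def taylor_chain_holds(z: float, tol: float = 1e-15) -> bool:
    """Check 1 - z <= exp(-z) <= 1 - z + z^2/2 <= 1 - z/2 at a single point z."""
    lower = 1.0 - z
    quadratic = 1.0 - z + z * z / 2.0
    upper = 1.0 - z / 2.0
    e = math.exp(-z)
    return lower <= e + tol and e <= quadratic + tol and quadratic <= upper + tol


zs = [k / 1000 for k in range(1001)]  # uniform grid on [0, 1]
holds_on_unit_interval = all(taylor_chain_holds(z) for z in zs)

# Outside [0, 1] the last inequality fails: at z = 1.5 the quadratic is 0.625 but 1 - z/2 is 0.25.
holds_at_1_5 = taylor_chain_holds(1.5)

print(holds_on_unit_interval, holds_at_1_5)
```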

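  • When \(s^2\le 1\), we have \(\mathrm{{e}}^{-s^2}\le 1\), \(1-\mathrm{{e}}^{-3\eta s^2}\le 3\eta s^2\) and \(1-\mathrm{{e}}^{-s^2}\ge s^2/2\), so the right-hand side of (6.1) is at most \(\frac{1\cdot 3\eta s^2}{s^2/2} = 6\eta \).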

  • When \(1\le s^2\le 1/(3\eta )\), the right-hand side of (6.1) is at most

    $$\begin{aligned} \frac{\mathrm{{e}}^{-s^2}\cdot 3\eta s^2}{1-1/\mathrm{{e}}} \le 6\eta s^2 \mathrm{{e}}^{-s^2}\le \frac{6\eta }{\mathrm{{e}}}, \end{aligned}$$

    where the last inequality follows from the observation that \(z\mapsto z\mathrm{{e}}^{-z}\) is monotonically decreasing for all \(z\ge 1\).

  • When \(s^2\ge 1/(3\eta )\), the right-hand side of (6.1) is at most

    $$\begin{aligned} \frac{\mathrm{{e}}^{-s^2}\cdot 1}{1-1/\mathrm{{e}}} \le \frac{\mathrm{{e}}^{-s^2}\cdot 3\eta s^2}{1-1/\mathrm{{e}}} \le \frac{6\eta }{\mathrm{{e}}}, \end{aligned}$$

    where the first inequality uses \(3\eta s^2\ge 1\) and the last inequality follows as in the previous case.
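The three cases above can be compared against direct evaluation of the right-hand side of (6.1). The following sketch does this on a grid of s and \(\eta \) values; the grid, the sampled values of \(\eta \) (all at most 1/3, an assumption made here so that the second and third cases both have \(s^2\ge 1\)), and the names `rhs_61` and `case_bound` are illustrative choices rather than anything taken from the paper.

```python
import math


def rhs_61(s: float, eta: float) -> float:
    """Right-hand side of (6.1): e^{-s^2} (1 - e^{-3 eta s^2}) / (1 - e^{-s^2})."""
    s2 = s * s
    # -expm1(-x) equals 1 - exp(-x) and avoids cancellation for small x.
    return math.exp(-s2) * (-math.expm1(-3.0 * eta * s2)) / (-math.expm1(-s2))


def case_bound(s: float, eta: float) -> float:
    """Upper bound claimed by the case analysis for the given s and eta."""
    if s * s <= 1.0:
        return 6.0 * eta           # first case
    return 6.0 * eta / math.e      # second and third cases share this bound


etas = [0.01, 0.05, 0.1, 0.2, 1.0 / 3.0]     # assumed sample values of eta
ss = [0.01 * k for k in range(1, 601)]       # s in (0, 6]

all_ok = all(rhs_61(s, eta) <= case_bound(s, eta) for eta in etas for s in ss)
print(all_ok)
```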

Altogether, we conclude that

$$\begin{aligned} \frac{G_r(t')}{G_r(t)} \le \frac{G((1+\eta )s)}{G(s)} \le \sqrt{1+6\eta } < 1+3\eta . \end{aligned}$$

\(\square \)
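As a final sanity check of assertion (iii), the sketch below samples the extreme choice \(t' = (1+\eta )t\) and compares the ratio \(G_r(t')/G_r(t)\) with both \(\sqrt{1+6\eta }\) and \(1+3\eta \). The sampled values of \(\eta \) and r, the range of t, and the helper `max_ratio` are assumptions for illustration; the result only covers the sampled points.

```python
import math


def G(r: float, t: float) -> float:
    """G_r(t) = r * sqrt(1 - exp(-t^2 / r^2))."""
    return r * math.sqrt(-math.expm1(-(t * t) / (r * r)))


def max_ratio(eta: float, r: float, t_max: float = 10.0, steps: int = 2000) -> float:
    """Largest sampled value of G_r((1 + eta) t) / G_r(t) for t in (0, t_max]."""
    ts = (t_max * k / steps for k in range(1, steps + 1))
    return max(G(r, (1.0 + eta) * t) / G(r, t) for t in ts)


# Map (eta, r) -> largest sampled ratio, for a few assumed parameter choices.
results = {
    (eta, r): max_ratio(eta, r)
    for eta in (0.01, 0.1, 1.0 / 3.0)
    for r in (0.5, 1.0, 4.0)
}

for (eta, r), m in results.items():
    within = m <= math.sqrt(1.0 + 6.0 * eta) < 1.0 + 3.0 * eta
    print(f"eta={eta:.3f} r={r}: max sampled ratio {m:.6f}, within bound: {within}")
```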

Cite this article

Gottlieb, L., Krauthgamer, R. A Nonlinear Approach to Dimension Reduction. Discrete Comput Geom 54, 291–315 (2015). https://doi.org/10.1007/s00454-015-9707-9

Keywords

  • Nonlinear embedding
  • Snowflake embedding
  • Doubling dimension
  • Dimension reduction