Multi-Resolution Geometric Analysis for Data in High Dimensions

Part of the book series: Applied and Numerical Harmonic Analysis (ANHA)

Abstract

Large data sets arise in a wide variety of applications and are often modeled as samples from a probability distribution in a high-dimensional space. It is sometimes assumed that the support of such a probability distribution is well approximated by a set of low intrinsic dimension, perhaps even a low-dimensional smooth manifold. Samples are often corrupted by high-dimensional noise. We are interested in developing tools for studying the geometry of such high-dimensional data sets. In particular, we present a multiscale transform that maps such high-dimensional data to a set of multiscale coefficients that are compressible or sparse under suitable assumptions on the data. We think of this as a geometric counterpart to multi-resolution analysis in wavelet theory: whereas wavelets map a signal (typically low-dimensional, such as a one-dimensional time series or a two-dimensional image) to a set of multiscale coefficients, the geometric wavelets discussed here map points in a high-dimensional point cloud to a multiscale set of coefficients. The geometric multi-resolution analysis (GMRA) we construct depends on the support of the probability distribution, and in this sense it fits the paradigm of dictionary learning or data-adaptive representations, although the representation we construct is mildly nonlinear, in contrast to standard linear representations. Finally, we apply the transform to several synthetic and real-world data sets.
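As a rough illustration of the kind of transform described above, the following Python sketch builds a tree of nested subsets of a point cloud, fits a low-dimensional affine approximation to each subset by local PCA, and records the differences between a point's approximations at consecutive scales as its multiscale coefficients. It is a simplified sketch, not the authors' construction: the median split along the first principal direction, the fixed local dimension d, and the names local_pca, build_tree, project, and multiscale_transform are all assumptions made for illustration.

```python
import numpy as np

def local_pca(points, d):
    """Return the mean of `points` and its top-d principal directions (columns)."""
    center = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - center, full_matrices=False)
    return center, vt[:d].T

def build_tree(points, d, min_size=20, depth=0, max_depth=6):
    """Recursively partition the data; each node stores a local affine model.

    Assumption: nodes are split at the median of the first principal
    coordinate. This is a stand-in partitioning rule chosen for brevity.
    """
    center, basis = local_pca(points, d)
    node = {"center": center, "basis": basis, "split": None, "children": None}
    if depth < max_depth and len(points) >= 2 * min_size:
        direction = basis[:, 0]
        scores = (points - center) @ direction
        threshold = np.median(scores)
        left, right = points[scores <= threshold], points[scores > threshold]
        if len(left) >= min_size and len(right) >= min_size:
            node["split"] = (direction, threshold)
            node["children"] = (
                build_tree(left, d, min_size, depth + 1, max_depth),
                build_tree(right, d, min_size, depth + 1, max_depth),
            )
    return node

def project(node, x):
    """Project x onto the affine subspace stored at `node`."""
    return node["center"] + node["basis"] @ (node["basis"].T @ (x - node["center"]))

def multiscale_transform(root, x):
    """Return the coarsest approximation of x and the scale-to-scale differences."""
    node = root
    approx = project(node, x)
    coarse, details = approx, []
    while node["children"] is not None:
        direction, threshold = node["split"]
        score = (x - node["center"]) @ direction
        node = node["children"][0] if score <= threshold else node["children"][1]
        finer = project(node, x)
        details.append(finer - approx)  # detail between consecutive scales
        approx = finer
    return coarse, details
```

By construction, the coarse approximation plus the sum of the details equals the finest-scale approximation of the point. In this sketch, when the data lie exactly on a d-dimensional plane every detail vanishes, while for curved data the details account for the improvement in approximation from one scale to the next.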

Notes

  1. Available at http://yann.lecun.com/exdb/mnist/

Acknowledgements

The authors thank E. Monson for useful discussions. AVL was partially supported by NSF and ONR. GC was partially supported by DARPA, ONR, NSF CCF, and NSF/DHS FODAVA program. MM is grateful for partial support from DARPA, NSF, ONR, and the Sloan Foundation.

Author information

Corresponding author

Correspondence to Mauro Maggioni.

Copyright information

© 2013 Birkhäuser Boston

About this chapter

Cite this chapter

Chen, G., Little, A.V., Maggioni, M. (2013). Multi-Resolution Geometric Analysis for Data in High Dimensions. In: Andrews, T., Balan, R., Benedetto, J., Czaja, W., Okoudjou, K. (eds) Excursions in Harmonic Analysis, Volume 1. Applied and Numerical Harmonic Analysis. Birkhäuser, Boston. https://doi.org/10.1007/978-0-8176-8376-4_13