Many fields of research deal with high-dimensional data sets. Hyperspectral images in remote sensing and in hyperspectral microscopy, and transactions in banking monitoring systems, are just a few examples of such data sets. Revealing the geometric structure of these data sets as a preliminary step facilitates their efficient processing. Often, only a small number of parameters govern the structure of a data set. This number is the true dimension of the data set, and it is the motivation for reducing the dimensionality of the set. Dimensionality reduction algorithms try to discover the true dimension of a data set.

In this chapter, we describe a natural framework based on diffusion processes for the multi-scale analysis of high-dimensional data sets (Coifman and Lafon, 2006). This scheme enables us to describe the geometric structures of such sets by utilizing the Newtonian paradigm, according to which a global description of a system can be derived by aggregating local transitions. Specifically, a Markov process is used to describe a random walk on the data set. The spectral properties of the Markov matrix associated with this process are used to embed the data set in a low-dimensional space. This scheme also facilitates the parametrization of a data set when the high-dimensional data set itself is not accessible and only a pairwise similarity matrix is at hand.
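The chapter body is not reproduced here, so the following is only a minimal sketch of the construction the abstract describes, following the standard diffusion maps recipe of Coifman and Lafon (2006) as commonly presented. The function names `diffusion_map` and `gaussian_similarity`, the Gaussian kernel, the scale `eps`, and the diffusion time `t` are illustrative assumptions, not the author's implementation. The sketch takes only a pairwise similarity matrix, normalizes it into a row-stochastic Markov matrix, and uses its leading non-trivial eigenvectors as low-dimensional coordinates.

```python
import numpy as np


def gaussian_similarity(X, eps):
    """Hypothetical helper: Gaussian kernel on squared Euclidean distances."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / eps)


def diffusion_map(W, n_components=2, t=1):
    """Embed n points given only a symmetric, non-negative n x n similarity matrix W.

    Returns the (n, n_components) embedding and all eigenvalues in descending order.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)  # degree of each point
    # The Markov matrix P = D^{-1} W is row-stochastic but not symmetric.
    # Its conjugate S = D^{-1/2} W D^{-1/2} is symmetric and has the same
    # eigenvalues, so a symmetric eigensolver can be used.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(S)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]     # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Right eigenvectors of P are recovered as psi = D^{-1/2} v.
    psi = eigvecs * d_inv_sqrt[:, None]
    # Drop the trivial pair (eigenvalue 1, constant eigenvector) and scale
    # each remaining coordinate by its eigenvalue raised to the power t.
    lam = eigvals[1:n_components + 1]
    return psi[:, 1:n_components + 1] * lam ** t, eigvals


# Toy data (an assumption for illustration): two small clusters in the plane.
X = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
              [2.0, 2.0], [2.2, 2.0], [2.0, 2.2]])
W = gaussian_similarity(X, eps=1.0)
Y, eigvals = diffusion_map(W, n_components=2, t=1)
```

On this toy input, the first diffusion coordinate `Y[:, 0]` takes opposite signs on the two clusters, which illustrates how the random walk's slow mixing between clusters shows up in the spectrum. Because `diffusion_map` never touches `X`, it applies equally when only `W` is available.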

References

  • M. Belkin and P. Niyogi. (2003), Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373-1396.

  • F. R. K. Chung. (1997), Spectral Graph Theory. AMS Regional Conference Series in Mathematics, 92.

  • R. R. Coifman, S. Lafon, A. Lee, M. Maggioni, B. Nadler, F. Warner, and S. Zucker. (2005), Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. In Proceedings of the National Academy of Sciences, volume 102, pages 7432-7437.

  • R. R. Coifman and S. Lafon. (2006), Diffusion maps. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21:5-30.

  • R. R. Coifman and M. Maggioni. (2006), Diffusion wavelets. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21(1):53-94.

  • P. Diaconis and D. Stroock. (1991), Geometric bounds for eigenvalues of Markov chains. The Annals of Applied Probability, 1(1):36-61.

  • C. Fowlkes, S. Belongie, F. Chung, and J. Malik. (2004), Spectral grouping using the Nyström method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):214-225.

  • R. I. Kondor and J. D. Lafferty. (2002), Diffusion kernels on graphs and other discrete input spaces. In Proceedings of the 19th International Conference on Machine Learning (ICML 02), pages 315-322.

  • S. Lafon, Y. Keller, and R. R. Coifman. (2006), Data fusion and multi-cue data matching by diffusion maps. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11):1784-1797.

  • S. Lafon and A. Lee. (2006), Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9):1393-1403.

  • M. Meila and J. Shi. (2001), A random walk's view of spectral segmentation. In Proceedings of the International Workshop on Artificial Intelligence and Statistics.

  • A. Schclar and A. Averbuch. (2007), Hyper-spectral segmentation via diffusion bases. Technical report, Tel Aviv University.

  • S. M. Ross. (1983), Stochastic Processes. John Wiley & Sons.

  • J. Shi and J. Malik. (2000), Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888-905.

  • A. Shtainhart, A. Schclar, and A. Averbuch. (2006), Neuronal tissues sub-nuclei segmentation using multi-contrast MRI. Technical report, Tel Aviv University.

  • J. Stewart. (2002), Calculus. Brooks Cole, 5th edition.

  • Y. Weiss. (1999), Segmentation using eigenvectors: A unifying view. In ICCV (2), pages 975-982.

  • S. X. Yu and J. Shi. (2003), Multiclass spectral clustering. In Proceedings of the IEEE International Conference on Computer Vision, pages 313-319.

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Schclar, A. (2008). A Diffusion Framework for Dimensionality Reduction. In: Maimon, O., Rokach, L. (eds) Soft Computing for Knowledge Discovery and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69935-6_13

  • DOI: https://doi.org/10.1007/978-0-387-69935-6_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-69934-9

  • Online ISBN: 978-0-387-69935-6

  • eBook Packages: Computer Science (R0)