A Diffusion Framework for Dimensionality Reduction

  • Alon Schclar

Many fields of research deal with high-dimensional data sets. Hyperspectral images in remote sensing and in hyperspectral microscopy, and transactions in banking monitoring systems, are just a few examples of such sets. Revealing the geometric structure of these data sets as a preliminary step facilitates their efficient processing. Often, only a small number of parameters governs the structure of a data set. This number is the true dimension of the data set, and it is the motivation for reducing the set's dimensionality. Dimensionality reduction algorithms try to discover this true dimension.

In this chapter, we describe a natural framework based on diffusion processes for the multi-scale analysis of high-dimensional data sets (Coifman and Lafon, 2006). This scheme describes the geometric structures of such sets by following the Newtonian paradigm, according to which a global description of a system can be derived by aggregating local transitions. Specifically, a Markov process is used to describe a random walk on the data set. The spectral properties of the Markov matrix associated with this process are used to embed the data set in a low-dimensional space. The scheme also facilitates the parametrization of a data set when the high-dimensional data itself is not accessible and only a pairwise similarity matrix is at hand.
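The pipeline outlined above (local similarities, a Markov matrix for the random walk, then a spectral embedding) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the chapter's own implementation: the Gaussian kernel, the scale parameter `eps`, the number of steps `t`, the unit-norm eigenvector scaling, and the function names `gaussian_similarity` and `diffusion_embedding` are all hypothetical choices made for the example. Note that `diffusion_embedding` needs only the similarity matrix, which mirrors the case where the original high-dimensional data is not accessible.

```python
import numpy as np


def gaussian_similarity(X, eps):
    """Pairwise Gaussian weights exp(-||x_i - x_j||^2 / eps).

    The kernel and the scale `eps` are assumptions made for this sketch.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    return np.exp(-sq_dists / eps)


def diffusion_embedding(W, n_components=2, t=1):
    """Embed points given only a symmetric, non-negative similarity matrix W.

    The random walk has the row-stochastic Markov matrix P = D^{-1} W, where D
    holds the row sums of W. P is similar to the symmetric matrix
    A = D^{-1/2} W D^{-1/2}, so the spectrum is computed from A with `eigh`.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    A = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    eigvals, eigvecs = np.linalg.eigh(A)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of A.
    psi = eigvecs * d_inv_sqrt[:, None]

    # Skip the trivial pair (eigenvalue 1, constant eigenvector) and scale the
    # remaining coordinates by eigenvalue^t, i.e. t steps of the random walk.
    keep = slice(1, n_components + 1)
    return (eigvals[keep] ** t) * psi[:, keep]
```

For example, calling `diffusion_embedding(gaussian_similarity(X, eps=2.0), n_components=2)` on an array `X` of shape `(n_points, n_features)` returns an `(n_points, 2)` array of low-dimensional coordinates. Larger values of `t` damp the coordinates tied to smaller eigenvalues, which is one way to view the multi-scale aspect of the analysis.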


Keywords: Random Walk, Markov Process, Dimensionality Reduction, Diffusion Process, True Dimension




  1. M. Belkin and P. Niyogi. (2003), Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373-1396.
  2. F. R. K. Chung. (1997), Spectral Graph Theory. AMS Regional Conference Series in Mathematics, 92.
  3. R. R. Coifman, S. Lafon, A. Lee, M. Maggioni, B. Nadler, F. Warner, and S. Zucker. (2005), Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. In Proceedings of the National Academy of Sciences, volume 102, pages 7432-7437.
  4. R. R. Coifman and S. Lafon. (2006), Diffusion maps. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21:5-30.
  5. R. R. Coifman and M. Maggioni. (2006), Diffusion wavelets. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21(1):53-94.
  6. P. Diaconis and D. Stroock. (1991), Geometric bounds for eigenvalues of Markov chains. The Annals of Applied Probability, 1(1):36-61.
  7. C. Fowlkes, S. Belongie, F. Chung, and J. Malik. (2004), Spectral grouping using the Nyström method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):214-225.
  8. R. I. Kondor and J. D. Lafferty. (2002), Diffusion kernels on graphs and other discrete input spaces. In Proceedings of the 19th International Conference on Machine Learning (ICML 02), pages 315-322.
  9. S. Lafon, Y. Keller, and R. R. Coifman. (2006), Data fusion and multi-cue data matching by diffusion maps. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11):1784-1797.
  10. S. Lafon and A. Lee. (2006), Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9):1393-1403.
  11. M. Meila and J. Shi. (2001), A random walk's view of spectral segmentation. In Proceedings of the International Workshop on Artificial Intelligence and Statistics.
  12. A. Schclar and A. Averbuch. (2007), Hyper-spectral segmentation via diffusion bases. Technical report, Tel Aviv University.
  13. S. M. Sheldon. (1983), Stochastic Processes. John Wiley & Sons.
  14. J. Shi and J. Malik. (2000), Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888-905.
  15. A. Shtainhart, A. Schclar, and A. Averbuch. (2006), Neuronal tissues sub-nuclei segmentation using multi-contrast MRI. Technical report, Tel Aviv University.
  16. J. Stewart. (2002), Calculus. Brooks Cole, 5th edition.
  17. Y. Weiss. (1999), Segmentation using eigenvectors: A unifying view. In ICCV (2), pages 975-982.
  18. S. X. Yu and J. Shi. (2003), Multiclass spectral clustering. In Proceedings of the IEEE International Conference on Computer Vision, pages 313-319.

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Alon Schclar (1)
  1. School of Computer Science, Tel Aviv University, Israel
