Many fields of research deal with high-dimensional data sets. Hyperspectral images in remote sensing and in hyperspectral microscopy, and transactions in banking monitoring systems, are just a few examples of such sets. Revealing the geometric structure of these data sets as a preliminary step facilitates their efficient processing. Often, only a small number of parameters govern the structure of a data set. This number is the true dimension of the data set and is the motivation for reducing its dimensionality. Dimensionality reduction algorithms try to discover the true dimension of a data set.
In this chapter, we describe a natural framework based on diffusion processes for the multi-scale analysis of high-dimensional data sets (Coifman and Lafon, 2006). This scheme enables us to describe the geometric structures of such sets by utilizing the Newtonian paradigm, according to which a global description of a system can be derived by aggregating local transitions. Specifically, a Markov process is used to describe a random walk on the data set. The spectral properties of the Markov matrix associated with this process are used to embed the data set in a low-dimensional space. This scheme also facilitates the parametrization of a data set when the high-dimensional data itself is not accessible and only a pairwise similarity matrix is at hand.
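The construction outlined above can be illustrated with a minimal sketch. It assumes a Gaussian kernel with scale `epsilon` for the pairwise similarities, which the abstract does not specify, and the function names `diffusion_map` and `diffusion_map_from_similarity`, together with the parameters `n_components` and `t`, are hypothetical names chosen for illustration. The second function starts from a similarity matrix alone, matching the case in which the original high-dimensional data is not accessible.

```python
import numpy as np


def diffusion_map_from_similarity(W, n_components=2, t=1):
    """Embed points given only a symmetric, non-negative similarity matrix W."""
    d = W.sum(axis=1)  # degree of each point
    # The Markov matrix is P = D^{-1} W (rows sum to 1). It is conjugate to the
    # symmetric matrix S = D^{-1/2} W D^{-1/2}, which is easier to decompose.
    S = W / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]  # sort eigenvalues in decreasing order
    vals, vecs = vals[order], vecs[:, order]
    # Right eigenvectors of P are D^{-1/2} times the eigenvectors of S.
    psi = vecs / np.sqrt(d)[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1) and scale the rest
    # by eigenvalue**t, where t is the number of random-walk steps.
    return (vals[1:n_components + 1] ** t) * psi[:, 1:n_components + 1]


def diffusion_map(X, epsilon, n_components=2, t=1):
    """Embed the rows of X using a Gaussian kernel (an assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-dist2 / epsilon)
    return diffusion_map_from_similarity(W, n_components, t)


if __name__ == "__main__":
    # Two small, well-separated groups of points in the plane.
    offsets = np.array([[a, b] for a in (-0.2, 0.0, 0.2) for b in (-0.2, 0.0, 0.2)])
    X = np.vstack([offsets, offsets + np.array([3.0, 0.0])])
    Y = diffusion_map(X, epsilon=1.0, n_components=2)
    print(Y.shape)
```

In this toy example the first embedding coordinate takes opposite signs on the two groups, so a single coordinate already captures the dominant structure of the data.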
References
M. Belkin and P. Niyogi. (2003), Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373-1396.
F. R. K. Chung. (1997), Spectral Graph Theory. AMS Regional Conference Series in Mathematics, 92.
R. R. Coifman, S. Lafon, A. Lee, M. Maggioni, B. Nadler, F. Warner, and S. Zucker. (2005), Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps. In Proceedings of the National Academy of Sciences, volume 102, pages 7432-7437.
R. R. Coifman and S. Lafon. (2006), Diffusion maps. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21:5-30.
R. R. Coifman and M. Maggioni. (2006), Diffusion wavelets. Applied and Computational Harmonic Analysis: special issue on Diffusion Maps and Wavelets, 21(1):53-94.
P. Diaconis and D. Stroock. (1991), Geometric bounds for eigenvalues of Markov chains. The Annals of Applied Probability, 1(1):36-61.
C. Fowlkes, S. Belongie, F. Chung, and J. Malik. (2004), Spectral grouping using the Nyström method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2):214-225.
R. I. Kondor and J. D. Lafferty. (2002), Diffusion kernels on graphs and other discrete input spaces. In Proceedings of the 19th International Conference on Machine Learning (ICML 02), pages 315-322.
S. Lafon, Y. Keller, and R. R. Coifman. (2006), Data fusion and multi-cue data matching by diffusion maps. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11):1784-1797.
S. Lafon and A. Lee. (2006), Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9):1393-1403.
M. Meila and J. Shi. (2001), A random walk's view of spectral segmentation. In Proceedings of the International Workshop on Artificial Intelligence and Statistics.
A. Schclar and A. Averbuch. (2007), Hyper-spectral segmentation via diffusion bases. Technical report, Tel Aviv University.
S. M. Sheldon. (1983), Stochastic Processes. John Wiley & Sons.
J. Shi and J. Malik. (2000), Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888-905.
A. Shtainhart, A. Schclar, and A. Averbuch. (2006), Neuronal tissues sub-nuclei segmentation using multi-contrast MRI. Technical report, Tel Aviv University.
J. Stewart. (2002), Calculus. Brooks Cole, 5th edition.
Y. Weiss. (1999), Segmentation using eigenvectors: A unifying view. In ICCV (2), pages 975-982.
S. X. Yu and J. Shi. (2003), Multiclass spectral clustering. In Proceedings of the IEEE International Conference on Computer Vision, pages 313-319.
Copyright information
© 2008 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Schclar, A. (2008). A Diffusion Framework for Dimensionality Reduction. In: Maimon, O., Rokach, L. (eds) Soft Computing for Knowledge Discovery and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-69935-6_13
DOI: https://doi.org/10.1007/978-0-387-69935-6_13
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-69934-9
Online ISBN: 978-0-387-69935-6
eBook Packages: Computer Science, Computer Science (R0)