A Diffusion Framework for Dimensionality Reduction
Many fields of research deal with high-dimensional data sets. Hyperspectral images in remote sensing and in hyperspectral microscopy, and transaction records in banking monitoring systems, are just a few examples of such sets. Revealing the geometric structure of these data sets as a preliminary step facilitates their efficient processing. Often, only a small number of parameters govern the structure of a data set. This number is the true dimension of the data set, and it is the motivation for reducing the dimensionality of the set. Dimensionality reduction algorithms try to discover this true dimension.
In this chapter, we describe a natural framework based on diffusion processes for the multi-scale analysis of high-dimensional data sets (Coifman and Lafon, 2006). This scheme enables us to describe the geometric structure of such sets by utilizing the Newtonian paradigm, according to which a global description of a system can be derived by aggregating local transitions. Specifically, a Markov process is used to describe a random walk on the data set. The spectral properties of the Markov matrix associated with this process are used to embed the data set in a low-dimensional space. This scheme also facilitates the parametrization of a data set when the high-dimensional data itself is not accessible and only a pairwise similarity matrix is at hand.
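The pipeline described above (pairwise similarities, a Markov matrix for the random walk, and a spectral embedding) can be sketched as follows. This is a minimal illustration, not the chapter's implementation: the Gaussian kernel, its bandwidth `epsilon`, the diffusion time `t`, and the function names `gaussian_similarity` and `diffusion_embedding` are all assumptions introduced here.

```python
import numpy as np


def gaussian_similarity(X, epsilon):
    """Pairwise similarities w_ij = exp(-||x_i - x_j||^2 / epsilon).

    The Gaussian kernel and the bandwidth `epsilon` are assumptions;
    any symmetric, non-negative similarity matrix could be used instead.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip round-off negatives
    return np.exp(-sq_dists / epsilon)


def diffusion_embedding(W, n_components=2, t=1):
    """Embed points using the spectrum of the Markov matrix P = D^{-1} W.

    W is a symmetric similarity matrix, so this also covers the case
    where only pairwise similarities (not the raw data) are available.
    Returns (embedding, eigenvalues sorted in descending order).
    """
    d = W.sum(axis=1)  # node degrees (row sums of W)
    # P is not symmetric, but A = D^{-1/2} W D^{-1/2} is symmetric and
    # shares its eigenvalues, so a symmetric eigensolver can be used.
    A = W / np.sqrt(np.outer(d, d))
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Right eigenvectors of P are recovered as psi = D^{-1/2} v.
    psi = eigvecs / np.sqrt(d)[:, None]
    # Skip the trivial pair (eigenvalue 1, constant eigenvector) and
    # scale each remaining coordinate by its eigenvalue raised to t.
    idx = slice(1, n_components + 1)
    return (eigvals[idx] ** t) * psi[:, idx], eigvals


if __name__ == "__main__":
    # Two small, well-separated groups of points in the plane.
    X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
    W = gaussian_similarity(X, epsilon=2.0)
    embedding, eigvals = diffusion_embedding(W, n_components=1)
    print(embedding.ravel())
```

In this sketch the number of retained coordinates `n_components` plays the role of the reduced dimension. In practice it would be chosen from the decay of the eigenvalues rather than fixed in advance.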
Keywords: Random Walk, Markov Process, Dimensionality Reduction, Diffusion Process, True Dimension
- F. R. K. Chung. (1997), Spectral Graph Theory. AMS Regional Conference Series in Mathematics, 92.
- R. I. Kondor and J. D. Lafferty. (2002), Diffusion kernels on graphs and other discrete input spaces. In Proceedings of the 19th International Conference on Machine Learning (ICML 02), pages 315-322.
- S. Lafon, Y. Keller, and R. R. Coifman. (2006), Data fusion and multi-cue data matching by diffusion maps. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11):1784-1797.
- M. Meila and J. Shi. (2001), A random walk’s view of spectral segmentation. In Proceedings of the International Workshop on Artificial Intelligence and Statistics.
- A. Schclar and A. Averbuch. (2007), Hyper-spectral segmentation via diffusion bases. Technical report, Tel Aviv University.
- S. M. Sheldon. (1983), Stochastic Processes. John Wiley & Sons.
- J. Shi and J. Malik. (2000), Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888-905.
- A. Shtainhart, A. Schclar, and A. Averbuch. (2006), Neuronal tissues sub-nuclei segmentation using multi-contrast MRI. Technical report, Tel Aviv University.
- J. Stewart. (2002), Calculus. Brooks Cole, 5th edition.
- Y. Weiss. (1999), Segmentation using eigenvectors: A unifying view. In ICCV (2), pages 975-982.
- S. X. Yu and J. Shi. (2003), Multiclass spectral clustering. In Proceedings of the IEEE International Conference on Computer Vision, pages 313-319.