An algorithm for intrinsic dimensionality estimation
In this paper a new method for analyzing the intrinsic dimensionality (ID) of low-dimensional manifolds in high-dimensional feature spaces is presented. The basic idea is to first extract a low-dimensional representation that captures the intrinsic topological structure of the input data and then to analyze this representation, i.e., to estimate the intrinsic dimensionality. Compared to previous approaches based on local PCA, the method has a number of important advantages. First, it can be shown to have only linear time complexity w.r.t. the dimensionality of the input space (in contrast to the cubic complexity of the conventional approach) and hence remains applicable even for very high-dimensional input spaces. Second, it is less sensitive to noise than earlier approaches. Finally, the extracted representation can be used directly for further data processing tasks, including auto-association and classification.
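To make the complexity claim concrete, the following minimal sketch shows one standard way in which a local PCA spectrum can be obtained at a cost that is linear in the input dimension n. It is an assumption for illustration, not necessarily the authors' implementation: given m difference vectors in n dimensions (m small and fixed), the nonzero eigenvalues of the n x n scatter matrix coincide with those of the m x m Gram matrix. The function names and the sizes m = 6, n = 200 are hypothetical.

```python
import numpy as np

def local_spectrum_gram(diffs):
    """Eigenvalues of the (uncentered) local scatter, via the m x m Gram matrix.

    diffs has shape (m, n): m difference vectors in an n-dimensional space.
    Building the Gram matrix costs O(m^2 n) and its eigenvalues O(m^3),
    so for fixed m the cost grows only linearly with n.
    """
    gram = diffs @ diffs.T / diffs.shape[0]        # shape (m, m)
    return np.sort(np.linalg.eigvalsh(gram))[::-1]

def local_spectrum_direct(diffs):
    """Same spectrum from the n x n scatter matrix (eigensolver cubic in n)."""
    scatter = diffs.T @ diffs / diffs.shape[0]     # shape (n, n)
    return np.sort(np.linalg.eigvalsh(scatter))[::-1]

# Hypothetical sizes: 6 difference vectors in a 200-dimensional input space.
rng = np.random.default_rng(0)
m, n = 6, 200
diffs = rng.normal(size=(m, n))

top_gram = local_spectrum_gram(diffs)
top_direct = local_spectrum_direct(diffs)[:m]      # remaining n - m values are ~0
print(np.allclose(top_gram, top_direct))
```

Both routes yield the same leading eigenvalues; only the Gram route avoids forming and decomposing an n x n matrix.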
The presented method for ID estimation is illustrated on a synthetic data set. It has also been successfully applied to ID estimation of full-scale image sequences; see [BS97].
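As a rough illustration of ID estimation on synthetic data, the sketch below applies the conventional local PCA baseline (in the spirit of [FO71]), not the method proposed in this paper. The data set is an assumption made up for this example: a curved two-dimensional surface embedded in a 50-dimensional space with small additive noise. The eigenvalue threshold of 5%, the neighbourhood size k = 30, and all function names are hypothetical choices.

```python
import numpy as np

def count_significant(samples, threshold=0.05):
    """Count PCA eigenvalues larger than `threshold` times the largest one."""
    centered = samples - samples.mean(axis=0)
    # Squared singular values of the centered data are the PCA eigenvalues
    # (up to a constant factor), returned in descending order.
    eig = np.linalg.svd(centered, compute_uv=False) ** 2
    return int(np.sum(eig > threshold * eig[0]))

def local_pca_id(points, n_centers=25, k=30, threshold=0.05, seed=0):
    """Median number of significant local PCA eigenvalues over random centers."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(len(points), size=n_centers, replace=False)
    counts = []
    for c in centers:
        dist = np.linalg.norm(points - points[c], axis=1)
        neighbours = np.argsort(dist)[:k]          # k nearest, including c itself
        counts.append(count_significant(points[neighbours], threshold))
    return int(np.median(counts))

# Synthetic data (assumed for illustration): the paraboloid (u, v, u^2 + v^2),
# mapped into 50 dimensions by a random orthonormal basis, plus small noise.
rng = np.random.default_rng(1)
u, v = rng.uniform(-1, 1, size=(2, 2000))
surface = np.stack([u, v, u**2 + v**2], axis=1)          # shape (2000, 3)
basis = np.linalg.qr(rng.normal(size=(50, 3)))[0]        # shape (50, 3)
data = surface @ basis.T + 1e-3 * rng.normal(size=(2000, 50))

local_estimate = local_pca_id(data)
global_count = count_significant(data)
print(local_estimate, global_count)
```

Under these settings the local estimate recovers the surface dimension of 2, whereas a single global PCA with the same threshold counts 3 significant directions because of the curvature. The estimate depends on the threshold and neighbourhood size, which is one reason such baselines are sensitive to noise.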
Keywords: Quantization Error, High Dimensional Feature Space, Linear Time Complexity, Topology Preservation, Local Subspace
- [BS97] J. Bruske and G. Sommer. Intrinsic dimensionality estimation with optimally topology preserving maps. Technical Report 9703, Inst. f. Inf. u. Prakt. Math., Christian-Albrechts-Universitaet zu Kiel, 1997. (submitted to IEEE PAMI).
- [FO71] K. Fukunaga and D. R. Olsen. An algorithm for finding intrinsic dimensionality of data. IEEE Transactions on Computers, 20(2):176–183, 1971.
- [JD88] A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall, 1988.
- [KL94] N. Kambhatla and T. K. Leen. Fast non-linear dimension reduction. In Advances in Neural Information Processing Systems, NIPS 6, pages 152–159, 1994.
- [Kru64] J. B. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29:1–27, 1964.
- [PBJD79] K. Pettis, T. Bailey, T. Jain, and R. Dubes. An intrinsic dimensionality estimator from near-neighbor information. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI, 1:25–37, 1979.
- [PTVF88] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, 1988.
- [Tru76] G. V. Trunk. Statistical estimation of the intrinsic dimensionality of a noisy signal collection. IEEE Transactions on Computers, 25:165–171, 1976.
- [VDNI94] T. Villmann, R. Der, and T. Martinetz. A novel approach to measure the topology preservation of feature maps. ICANN, pages 289–301, 1994.