A Novel Intrinsic Dimensionality Estimator Based on Rank-Order Statistics

  • S. Bassis
  • A. Rozza (corresponding author)
  • C. Ceruti
  • G. Lombardi
  • E. Casiraghi
  • P. Campadelli
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7627)

Abstract

In the past two decades the estimation of the intrinsic dimensionality of a dataset has gained considerable importance, since it provides relevant information for several real-life applications. Although a great deal of research effort has been devoted to developing effective intrinsic dimensionality estimators, the problem is still open. For this reason, in this paper we propose a novel, robust intrinsic dimensionality estimator that exploits the information conveyed by normalized nearest-neighbor distances through a technique based on rank-order statistics, which limits the underestimation commonly caused by the edge effect. Experiments on both synthetic and real datasets highlight the robustness and effectiveness of the proposed algorithm compared to state-of-the-art methodologies.
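
To make the role of normalized nearest-neighbor distances concrete, the sketch below shows a minimal, hypothetical maximum-likelihood estimator in the spirit of the related MiND estimators (Lombardi et al., ECML PKDD 2011), not the rank-order procedure proposed in this paper: assuming data drawn locally uniformly from a d-dimensional ball, the ratio rho = r_1 / r_{k+1} of the first to the (k+1)-th nearest-neighbor distance has density k·d·rho^(d−1)·(1 − rho^d)^(k−1), and d can be recovered by maximizing the sample log-likelihood. All function names are invented for the example.

    # Illustrative sketch only: an intrinsic-dimensionality estimate from
    # normalized nearest-neighbour distances via maximum likelihood, in the
    # spirit of the MiND estimators. NOT the rank-order method of this paper.
    import numpy as np
    from scipy.spatial import cKDTree

    def normalized_nn_distances(X, k=10):
        """rho_i = (distance to 1st neighbour) / (distance to (k+1)-th neighbour)."""
        tree = cKDTree(X)
        # query k + 2 neighbours: the first returned neighbour is the point itself
        dists, _ = tree.query(X, k=k + 2)
        return dists[:, 1] / dists[:, k + 1]

    def estimate_id(X, k=10, d_max=50):
        """Integer d maximising the log-likelihood of the rho sample under
        f(rho) = k * d * rho**(d-1) * (1 - rho**d)**(k-1), rho in (0, 1)."""
        rho = np.clip(normalized_nn_distances(X, k), 1e-12, 1.0 - 1e-12)
        log_liks = [
            np.sum(np.log(k * d) + (d - 1) * np.log(rho)
                   + (k - 1) * np.log1p(-rho ** d))
            for d in range(1, d_max + 1)
        ]
        return int(np.argmax(log_liks)) + 1

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # 5-dimensional Gaussian linearly embedded in R^10: true ID = 5
        X = np.hstack([rng.normal(size=(2000, 5)), np.zeros((2000, 5))])
        print(estimate_id(X, k=10))  # typically close to 5

Because the density of rho concentrates near 0 as d grows, the likelihood is sharply peaked for low-dimensional data and flatter in high dimensions, which is one reason plain nearest-neighbor estimators tend to underestimate large intrinsic dimensionalities, the issue the paper's rank-order technique is designed to mitigate.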

Keywords

Intrinsic dimensionality estimation · Manifold learning · Rank-order statistics


Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  • S. Bassis (1)
  • A. Rozza (2), corresponding author
  • C. Ceruti (1)
  • G. Lombardi (1)
  • E. Casiraghi (1)
  • P. Campadelli (1)

  1. Dipartimento di Informatica, Università degli Studi di Milano, Milano, Italy
  2. Research Team, Hyera Software, Coccaglio (BS), Italy
