On the Generalization of the Mahalanobis Distance

  • Gabriel Martos
  • Alberto Muñoz
  • Javier González
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8258)

Abstract

The Mahalanobis distance (MD) is a widely used measure in Statistics and Pattern Recognition. Under the assumption that the data are generated from a Gaussian distribution, it uses the covariance matrix to evaluate the distance between a data point and the distribution mean. In this work, we generalize the MD to distributions in the exponential family, providing both a definition in terms of the data density function and a computable version. We demonstrate its performance on several artificial and real data scenarios.
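
A minimal sketch of the quantities involved (plain NumPy/SciPy, not code from the paper; all variable names and data values are illustrative): it computes the classical MD of a point to a sample mean, and numerically checks the Gaussian special case of the exponential-family view, namely that for two Gaussians sharing a covariance matrix the Kullback-Leibler divergence (the Bregman divergence generated by this family) equals half the squared MD.

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

# Toy Gaussian sample (illustrative parameters, not from the paper)
rng = np.random.default_rng(0)
Sigma_true = np.array([[2.0, 0.5],
                       [0.5, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma_true, size=1000)

mu = X.mean(axis=0)                                  # sample mean
Sigma_inv = np.linalg.inv(np.cov(X, rowvar=False))   # inverse sample covariance

x = np.array([3.0, -2.0])                            # query point

# Classical Mahalanobis distance of x to the sample mean
d_M = mahalanobis(x, mu, Sigma_inv)

# KL divergence between N(x, Sigma) and N(mu, Sigma): the Bregman
# divergence of the fixed-covariance Gaussian exponential family
kl = 0.5 * (x - mu) @ Sigma_inv @ (x - mu)

print(d_M, np.sqrt(2.0 * kl))                        # the two values coincide
```

Replacing the Gaussian family by another member of the exponential family replaces this quadratic form with the corresponding Bregman divergence, which is the direction in which the paper generalizes the MD.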

Keywords

Mahalanobis Distance · Outlier Detection · Texture Image · Exponential Family · Bregman Divergence


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Gabriel Martos (1)
  • Alberto Muñoz (1)
  • Javier González (2)
  1. Department of Statistics, University Carlos III, Madrid, Spain
  2. J. Bernoulli Institute for Mathematics and Computer Science, University of Groningen, The Netherlands
