A New Estimator of the Mahalanobis Distance and its Application to Classification Error Rate Estimation

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 639))

Abstract

A well-known category of classification error rate estimators is the so-called parametric error rate estimators. These estimators are typically expressed as functions of the training sample size, the dimensionality of the observation vector, and the Mahalanobis distance between the classes. However, all parametric classification error rate estimators are biased, and the main source of this bias is the estimate of the Mahalanobis distance. In this paper we propose a new Mahalanobis distance estimation method designed for use in parametric classification error rate estimators. Experiments with real-world and synthetic data sets show that the new estimator helps to reduce the bias of the most common parametric classification error rate estimators. Additionally, parametric estimators that use the new estimates of the Mahalanobis distance outperform non-parametric classification error rate estimators, such as resubstitution, repeated 10-fold cross-validation, and leave-one-out, in terms of root-mean-square error.
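To make the bias problem described in the abstract concrete, here is a minimal sketch in Python. It is not the estimator proposed in the paper, which the abstract does not specify. It uses the textbook plug-in estimate of the squared Mahalanobis distance (sample means and pooled sample covariance) and the simplest parametric error formula, Φ(−δ/2). That formula gives the Bayes error only under assumed conditions: two Gaussian classes with a common covariance matrix and equal priors. The function names, the simulation settings (5 dimensions, 10 samples per class, true squared distance 4) and the use of NumPy are all illustrative assumptions.

```python
import math

import numpy as np


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def plugin_mahalanobis_sq(x1: np.ndarray, x2: np.ndarray) -> float:
    """Plug-in estimate of the squared Mahalanobis distance between two classes.

    x1 and x2 hold one observation per row and need at least two columns.
    """
    n1, n2 = len(x1), len(x2)
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    # Pooled covariance: per-class unbiased covariances weighted by degrees of freedom.
    pooled = (
        (n1 - 1) * np.cov(x1, rowvar=False) + (n2 - 1) * np.cov(x2, rowvar=False)
    ) / (n1 + n2 - 2)
    return float(diff @ np.linalg.solve(pooled, diff))


def parametric_error_rate(delta_sq: float) -> float:
    """Phi(-delta / 2): error rate under the Gaussian, equal-covariance, equal-prior assumption."""
    return normal_cdf(-math.sqrt(delta_sq) / 2.0)


def mean_plugin_estimate(p=5, n_per_class=10, true_delta_sq=4.0, trials=2000, seed=0):
    """Average the plug-in squared distance over many small synthetic training sets."""
    rng = np.random.default_rng(seed)
    shift = np.zeros(p)
    shift[0] = math.sqrt(true_delta_sq)  # identity covariance, so this sets delta^2 exactly
    estimates = [
        plugin_mahalanobis_sq(
            rng.standard_normal((n_per_class, p)),
            rng.standard_normal((n_per_class, p)) + shift,
        )
        for _ in range(trials)
    ]
    return float(np.mean(estimates))


if __name__ == "__main__":
    true_delta_sq = 4.0
    avg = mean_plugin_estimate(true_delta_sq=true_delta_sq)
    print("true squared distance:      ", true_delta_sq)
    print("mean plug-in estimate:      ", avg)
    print("error rate at true distance:", parametric_error_rate(true_delta_sq))
    print("error rate at mean estimate:", parametric_error_rate(avg))
```

With small training sets the plug-in distance is, on average, larger than the true distance, so an error rate computed from it is optimistic. This is the kind of bias that a better estimate of the Mahalanobis distance aims to reduce.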

Author information

Corresponding author

Correspondence to Mindaugas Gvardinskas.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Gvardinskas, M. (2016). A New Estimator of the Mahalanobis Distance and its Application to Classification Error Rate Estimation. In: Dregvaite, G., Damasevicius, R. (eds) Information and Software Technologies. ICIST 2016. Communications in Computer and Information Science, vol 639. Springer, Cham. https://doi.org/10.1007/978-3-319-46254-7_25

  • DOI: https://doi.org/10.1007/978-3-319-46254-7_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46253-0

  • Online ISBN: 978-3-319-46254-7

  • eBook Packages: Computer Science, Computer Science (R0)
