Linear components of quadratic classifiers

Regular Article

Abstract

We obtain a decomposition of any quadratic classifier in terms of products of hyperplanes. These hyperplanes can be viewed as relevant linear components of the quadratic rule (with respect to the underlying classification problem). As an application, we introduce the associated multidirectional classifier, a piecewise linear classification rule induced by the approximating products. Such a classifier is useful for determining linear combinations of the predictor variables with the ability to discriminate. We also show that this classifier can be used to reduce the dimension of the data and to identify the most important variables for classifying new elements. Finally, we illustrate with a real data set the use of these linear components to construct oblique classification trees.
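The flavour of the construction can be sketched numerically. For two Gaussian classes, the quadratic part of the QDA rule is governed by the difference of the class precision (inverse covariance) matrices, so the eigenvectors of that difference are natural candidates for the relevant linear directions: a pair of directions whose eigenvalues have opposite signs contributes a term of the form u^2 - v^2 = (u - v)(u + v), a product of two hyperplanes. The R sketch below illustrates this idea on simulated data; it is an illustration under standard QDA assumptions, not the exact decomposition derived in the paper, and all variable names are ours.

    ## Minimal sketch: candidate linear components from the quadratic
    ## part of a two-class QDA rule (illustrative only).
    set.seed(1)
    n <- 200; p <- 4
    X1 <- matrix(rnorm(n * p), n, p)                  # class 1 sample
    X2 <- 0.5 + matrix(rnorm(n * p, sd = 1.5), n, p)  # class 2 sample
    S1 <- cov(X1); S2 <- cov(X2)                      # class covariance estimates
    A  <- (solve(S2) - solve(S1)) / 2                 # quadratic coefficient matrix
    eig <- eigen(A, symmetric = TRUE)
    round(eig$values, 3)       # signs and sizes of the quadratic terms
    eig$vectors[, 1]           # leading direction: a candidate linear component

Eigenvalues of large absolute value flag the linear combinations of the predictors along which the rule is genuinely quadratic. Projecting the data onto a few leading directions is the dimension-reduction use mentioned above, and the same directions can serve as split directions when growing an oblique classification tree.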

Keywords

Supervised classification · Fisher linear discriminant analysis · Quadratic discriminant analysis · Dimension reduction · Feature extraction · Oblique classification trees

Mathematics Subject Classification

62H30 

Acknowledgements

The authors are grateful to the reviewers and the associate editor for their insightful comments, which have improved the presentation of the paper. We would like to thank Jesús María Arregui (University of the Basque Country) and Jesús Gonzalo (Universidad Autónoma de Madrid), with whom we have shared illuminating conversations on quadratic forms.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Departamento de Matemáticas, Universidad Autónoma de Madrid, Madrid, Spain
