Heteroscedastic Discriminant Analysis Using R

  • Gero Szepannek
  • Uwe Ligges
Chapter
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

For dimensionality reduction in classification, Linear Discriminant Analysis (LDA) is probably the most common approach: a linear dimension reduction technique that also returns a classification rule. When the classes are heteroscedastic, Quadratic Discriminant Analysis (QDA) can be used to determine an appropriate classification rule, but QDA does not provide dimensionality reduction. Sliced Average Variance Estimation (SAVE), as implemented in the R package dr, has been shown to be adequate in such situations. This paper presents an alternative approach to linear dimensionality reduction for situations of heteroscedastic within-class covariances, namely Heteroscedastic Discriminant Analysis (HDA), together with its R implementation. Furthermore, tests are suggested to determine the dimension of the discriminative data subspace, and a generalization of HDA that regularizes the covariance matrix estimates is proposed. Applications of HDA in R are demonstrated, as well as a small simulation study showing that HDA is preferable to SAVE in a situation where the classes differ in both means and covariances.
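The regularization of covariance estimates mentioned in the abstract can be illustrated with a short, self-contained R sketch: each class-specific covariance matrix is shrunk towards the pooled within-class estimate, in the spirit of Friedman's regularized discriminant analysis. Note that this is not the hda package's own code; the function name `reg_cov` and the parameter `lambda` are choices made here for illustration.

```r
## Regularized within-class covariance estimation (illustrative sketch):
## each class covariance is shrunk towards the pooled covariance.
## 'reg_cov' and 'lambda' are names chosen for this sketch, not the
## interface of the hda package.
reg_cov <- function(x, grouping, lambda = 0.5) {
  x <- as.data.frame(x)
  grouping <- factor(grouping)
  groups <- split(x, grouping)
  n <- nrow(x)
  g <- nlevels(grouping)
  ## Pooled within-class covariance matrix.
  pooled <- Reduce(`+`,
                   lapply(groups, function(xi) (nrow(xi) - 1) * cov(xi))) / (n - g)
  ## Convex combination of class-specific and pooled estimates:
  ## lambda = 0 keeps the QDA-style class covariances,
  ## lambda = 1 yields the common LDA-style pooled covariance.
  lapply(groups, function(xi) (1 - lambda) * cov(xi) + lambda * pooled)
}

data(iris)
S <- reg_cov(iris[, 1:4], iris$Species, lambda = 0.3)
length(S)      # one matrix per class
dim(S$setosa)  # 4 x 4
```

Interpolating between the heteroscedastic (per-class) and homoscedastic (pooled) extremes in this way stabilizes the estimates when classes are small relative to the dimension.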

References

  1. Burget, L. (2004). Combination of speech features using smoothed heteroscedastic linear discriminant analysis. In Proceedings of Interspeech 2004, Jeju, Korea (pp. 2549–2552).
  2. Cook, R. D., & Weisberg, S. (1991). Comment on Li. Journal of the American Statistical Association, 86, 328–332.
  3. Dheeru, D., & Karra Taniskidou, E. (2017). UCI machine learning repository. http://archive.ics.uci.edu/ml.
  4. Di Pillo, P. (1976). The application of bias to discriminant analysis. Communications in Statistics - Theory and Methods, 5(9), 843–854.
  5. Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, 179–188.
  6. Friedman, J. (1989). Regularized discriminant analysis. Journal of the American Statistical Association, 84, 165–175.
  7. Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
  8. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. New York: Springer.
  9. Hennig, C. (2004). Asymmetric linear dimension reduction for classification. Journal of Computational and Graphical Statistics, 13, 930–945.
  10. Hennig, C. (2018). fpc: Flexible procedures for clustering. R package version 2.1-11.1. http://CRAN.R-project.org/package=fpc.
  11. Kumar, N. (1997). Investigation of silicon auditory models and generalization of linear discriminant analysis for improved speech recognition. Ph.D. thesis, Johns Hopkins University.
  12. Kumar, N., & Andreou, A. (1998). Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition. Speech Communication, 25(4), 283–297.
  13. Leisch, F., & Dimitriadou, E. (2012). mlbench: Machine learning benchmark problems. R package version 2.1-1. http://CRAN.R-project.org/package=mlbench.
  14. Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86, 316–327.
  15. Mardia, K., Kent, J., & Bibby, J. (1979). Multivariate analysis. London: Academic Press.
  16. Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., Leisch, F., Chang, C. C., et al. (2019). e1071: Misc functions of the Department of Statistics (e1071), TU Wien. R package version 1.7-0.1. http://CRAN.R-project.org/package=e1071.
  17. Pardoe, I., Yin, X., & Cook, R. D. (2006). Graphical tools for quadratic discriminant analysis. Technometrics, 49(2), 172–183.
  18. Roever, C., Raabe, N., Luebke, K., Ligges, U., Szepannek, G., & Zentgraf, M. (2018). klaR: Classification and visualization. R package version 0.6-14. http://CRAN.R-project.org/package=klaR.
  19. Schott, J. (1993). Dimension reduction in quadratic discriminant analysis. Computational Statistics and Data Analysis, 16, 161–174.
  20. Szepannek, G. (2018). hda: Heteroscedastic discriminant analysis. R package version 0.2-14. http://CRAN.R-project.org/package=hda.
  21. Szepannek, G., Harczos, T., Klefenz, F., & Weihs, C. (2009). Extending features for automatic speech recognition by means of auditory modelling. In Proceedings of the European Signal Processing Conference (EUSIPCO), Glasgow (pp. 1235–1239).
  22. Weihs, C., Ligges, U., Luebke, K., & Raabe, N. (2005). klaR: Analyzing German business cycles (pp. 225–343). Berlin: Springer.
  23. Weisberg, S. (2002). Dimension reduction regression in R. Journal of Statistical Software, 7(1), 1–22.
  24. Young, D., Marco, V., & Odell, P. (1987). Quadratic discrimination: Some results on optimal low-dimensional representation. Journal of Statistical Planning and Inference, 17, 307–319.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Stralsund University of Applied Sciences, Stralsund, Germany
  2. Department of Statistics, TU Dortmund University, Dortmund, Germany
