Heteroscedastic Discriminant Analysis Using R
For dimensionality reduction in classification, Linear Discriminant Analysis (LDA) is probably the most common approach: it is a linear dimension reduction technique that also returns a classification rule. When the classes are heteroscedastic, Quadratic Discriminant Analysis (QDA) can be used to determine an appropriate classification rule, but QDA does not provide dimensionality reduction. Sliced Average Variance Estimation (SAVE), implemented in the R package dr, has been shown to be adequate in such situations. This paper presents an alternative approach to linear dimensionality reduction for heteroscedastic intraclass covariances, namely Heteroscedastic Discriminant Analysis (HDA), together with its R implementation. Furthermore, tests are suggested for determining the dimension of the discriminative data subspace, and a generalization of HDA by regularization of the covariance matrix estimates is proposed. Examples of applying HDA in R are given, and a small simulation study indicates that HDA is preferable to SAVE in a situation where the classes differ in both means and covariances.
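The contrast between LDA and QDA drawn in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the MASS package (shipped with standard R installations) and the built-in iris data, and it does not call the HDA implementation itself. All object names are hypothetical.

```r
# Illustrative sketch only: MASS and iris are assumptions, not the paper's setup.
library(MASS)

# LDA returns a classification rule AND a linear projection onto
# at most K - 1 discriminant directions (K = 3 classes here).
lda_fit  <- lda(Species ~ ., data = iris)
lda_pred <- predict(lda_fit)
lda_scores <- lda_pred$x      # reduced representation: one row per observation
dim(lda_scores)

# QDA allows class-specific covariance matrices, but its prediction
# contains only class labels and posterior probabilities, no projection.
qda_fit  <- qda(Species ~ ., data = iris)
qda_pred <- predict(qda_fit)
names(qda_pred)

# Resubstitution error rates (optimistic; shown for illustration only).
lda_err <- mean(lda_pred$class != iris$Species)
qda_err <- mean(qda_pred$class != iris$Species)
```

The LDA prediction carries a matrix of discriminant scores that can be plotted or passed to another classifier, whereas the QDA prediction has no such component. This is the gap that a heteroscedastic dimension reduction method is meant to fill.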
- Burget, L. (2004). Combination of speech features using smoothed heteroscedastic linear discriminant analysis. In Proceedings of Interspeech 2004, Jeju/Korea (pp. 2549–2552).
- Dheeru, D., & Karra Taniskidou, E. (2017). UCI machine learning repository. http://archive.ics.uci.edu/ml.
- Hennig, C. (2018). fpc: Flexible procedures for clustering. R package version 2.1-11.1. http://CRAN.R-project.org/package=fpc.
- Kumar, N. (1997). Investigation of silicon auditory models and generalization of linear discriminant analysis for improved speech recognition.
- Leisch, F., & Dimitriadou, E. (2012). mlbench: Machine learning benchmark problems. R package version 2.1-1. http://CRAN.R-project.org/package=mlbench.
- Mardia, K., Kent, J., & Bibby, J. (1979). Multivariate analysis. Academic.
- Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., Leisch, F., Chang, C.C., et al. (2019). e1071: Misc functions of the department of statistics (e1071), TU Wien. R package version 1.7-0.1. http://CRAN.R-project.org/package=e1071.
- Roever, C., Raabe, N., Luebke, K., Ligges, U., Szepannek, G., & Zentgraf, M. (2018). klaR: Classification and visualization. R package version 0.6-14. http://CRAN.R-project.org/package=klaR.
- Szepannek, G. (2018). hda: Heteroscedastic discriminant analysis. R package version 0.2-14. http://CRAN.R-project.org/package=hda.
- Szepannek, G., Harczos, T., Klefenz, F., & Weihs, C. (2009). Extending features for automatic speech recognition by means of auditory modelling. In Proceedings of European Speech and Signal Processing Conference (EUSIPCO), Glasgow (pp. 1235–1239).
- Weihs, C., Ligges, U., Luebke, K., & Raabe, N. (2005). klaR: analyzing German business cycles (pp. 225–343). Berlin: Springer.