On Kolmogorov asymptotics of estimators of the misclassification error rate in linear discriminant analysis

Zollanvari, Amin; Genton, Marc G.

doi:10.1007/s13171-013-0029-9

Published: 24 May 2013

Volume 75, pages 300–326, (2013)

Sankhya A

Amin Zollanvari¹ &
Marc G. Genton²

Abstract

We provide a fundamental theorem that can be used in conjunction with Kolmogorov asymptotic conditions to derive the first moments of well-known estimators of the actual error rate in linear discriminant analysis of a multivariate Gaussian model under the assumption of a common known covariance matrix. The estimators studied in this paper are the plug-in and smoothed resubstitution error estimators, neither of which has previously been studied under Kolmogorov asymptotic conditions. As a result of this work, we present an optimal smoothing parameter that makes the smoothed resubstitution an unbiased estimator of the true error. For the sake of completeness, we further show how to use the presented theorem to recover several previously reported results, namely the first moments of the resubstitution estimator and of the actual error rate. We provide numerical examples to show the accuracy of the resulting finite-sample approximations in situations where the number of dimensions is comparable to or even larger than the sample size.
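
The quantities named in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes two equally likely Gaussian classes, an identity covariance matrix (equivalent to whitening the data with the known common covariance), equal class sample sizes, and one common textbook form of the plug-in and smoothed resubstitution estimators. The function names, the smoothing parameter `b`, and the chosen dimensions are illustrative assumptions only, and `b` is left as a free input rather than set to the optimal value derived in the paper.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def draw_sample(rng, mu, n):
    """Draw n points from N(mu, I); identity covariance is an assumption."""
    return [[m + rng.gauss(0.0, 1.0) for m in mu] for _ in range(n)]


def lda_error_estimates(x0, x1, mu0, mu1, b=1.0):
    """True error and three estimates for LDA with known identity covariance.

    Assumed forms (equal priors, rule "assign to class 0 if W(x) > 0"):
      W(x)     = (x - (m0 + m1) / 2)^T (m0 - m1)
      plug-in  = Phi(-delta_hat / 2), delta_hat = ||m0 - m1||
      resub    = fraction of training points misclassified by W
      smoothed = average of Phi(-/+ W(x_i) / (b * delta_hat))
    """
    m0, m1 = mean_vector(x0), mean_vector(x1)
    a = [p - q for p, q in zip(m0, m1)]           # discriminant direction
    mid = [(p + q) / 2 for p, q in zip(m0, m1)]   # midpoint of sample means
    delta_hat = math.sqrt(dot(a, a))              # estimated Mahalanobis distance

    def w(x):
        return dot([xi - ci for xi, ci in zip(x, mid)], a)

    n = len(x0) + len(x1)
    # W(x) is Gaussian under each class, so the true error has a closed form.
    true_err = 0.5 * (PHI(-w(mu0) / delta_hat) + PHI(w(mu1) / delta_hat))
    plug_in = PHI(-delta_hat / 2)
    resub = (sum(w(x) <= 0 for x in x0) + sum(w(x) > 0 for x in x1)) / n
    smoothed = (sum(PHI(-w(x) / (b * delta_hat)) for x in x0)
                + sum(PHI(w(x) / (b * delta_hat)) for x in x1)) / n
    return {"true": true_err, "plug_in": plug_in,
            "resub": resub, "smoothed": smoothed}


def simulate(p=50, n_per_class=25, delta=2.0, b=1.0, seed=0):
    """One draw in a regime where the dimension p is comparable to the sample size."""
    rng = random.Random(seed)
    mu0 = [delta / 2] + [0.0] * (p - 1)
    mu1 = [-delta / 2] + [0.0] * (p - 1)
    x0 = draw_sample(rng, mu0, n_per_class)
    x1 = draw_sample(rng, mu1, n_per_class)
    return lda_error_estimates(x0, x1, mu0, mu1, b=b)


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>8s}: {value:.4f}")
```

Averaging each entry over many seeds gives a Monte Carlo estimate of the corresponding first moment, which is the kind of quantity the finite-sample approximations in the paper are meant to predict. In this sketch a very small `b` reproduces plain resubstitution, while a very large `b` pushes the smoothed estimate toward 0.5.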


Author information

Authors and Affiliations

Department of Statistics and Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
Amin Zollanvari
CEMSE Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia
Marc G. Genton

Corresponding author

Correspondence to Amin Zollanvari.

About this article

Cite this article

Zollanvari, A., Genton, M.G. On Kolmogorov asymptotics of estimators of the misclassification error rate in linear discriminant analysis. Sankhya A 75, 300–326 (2013). https://doi.org/10.1007/s13171-013-0029-9

Received: 16 August 2012
Published: 24 May 2013
Issue Date: August 2013
DOI: https://doi.org/10.1007/s13171-013-0029-9

Keywords and phrases.

AMS (2000) subject classification.
