Efficient matrixized classification learning with separated solution process

Abstract

The matrix-pattern-oriented Ho–Kashyap classifier (MatMHKS) constrains matrixized samples with two-sided weight vectors, so it can handle both vectorized and matrixized samples. For a vectorized sample, converting the vector into a matrix relieves the curse of dimensionality and broadens the ways in which the sample can be represented. Although MatMHKS has been shown to classify effectively, it has three limitations. First, training is slow because the two weight vectors are updated alternately in each iteration. Second, MatMHKS is not suited to imbalanced problems. Third, no effective analysis of generalization risk exists for matrixized classifiers. To address these limitations, this paper proposes an efficient matrixized Ho–Kashyap classifier (EMatMHKS). EMatMHKS updates the two-sided weight vectors separately, which avoids the repeated computation of the inverse matrix in MatMHKS and thus significantly improves training speed. In addition, introducing a weight matrix allows both balanced and imbalanced situations to be tackled. Finally, the PAC-Bayes bound is used to reflect the upper bound on the error of matrixized and vectorized classifiers. Experiments on both balanced and imbalanced data sets validate the effectiveness and efficiency of the proposed EMatMHKS.
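
The two-sided weighting described in the abstract can be illustrated with a minimal sketch. It assumes the bilinear decision function g(A) = uᵀAv + b that is commonly used for matrix-pattern classifiers; the abstract itself does not state the formula. The names `matrixize`, `bilinear_score`, `u`, `v`, and `bias` are hypothetical, and the weights are hand-picked rather than learned. The sketch shows only how a vectorized sample is reshaped and scored, not the EMatMHKS training procedure.

```python
from typing import List

Matrix = List[List[float]]


def matrixize(x: List[float], rows: int, cols: int) -> Matrix:
    """Reshape a length rows*cols vector into a rows x cols matrix (row-major)."""
    if len(x) != rows * cols:
        raise ValueError("vector length must equal rows * cols")
    return [x[r * cols:(r + 1) * cols] for r in range(rows)]


def bilinear_score(u: List[float], A: Matrix, v: List[float], bias: float) -> float:
    """Assumed two-sided discriminant g(A) = u^T A v + bias."""
    return sum(u[i] * A[i][j] * v[j]
               for i in range(len(u))
               for j in range(len(v))) + bias


# A 6-dimensional vectorized sample viewed as a 2 x 3 matrix.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
A = matrixize(x, rows=2, cols=3)

# Illustrative (not learned) left and right weight vectors.
u = [1.0, -1.0]
v = [0.5, 0.0, 1.0]

score = bilinear_score(u, A, v, bias=0.25)
label = 1 if score >= 0 else -1

# Free parameters: one weight per feature plus a bias (vector form)
# versus one weight per row and per column plus a bias (matrix form).
n_params_vector = len(x) + 1
n_params_matrix = len(u) + len(v) + 1
```

The parameter saving is small for this toy 2 × 3 case but grows with the sample size: under the same assumed form, a 784-dimensional vector reshaped to 28 × 28 would need 28 + 28 + 1 = 57 parameters instead of 784 + 1 = 785.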

Notes

  1. http://www.keel.es/

Acknowledgements

This work is supported by the Natural Science Foundation of China under Grant No. 61672227, the ‘Shuguang Program’ supported by the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission, the Natural Science Foundation of China under Grant No. 61806078, the National Science Foundation of China for Distinguished Young Scholars under Grant No. 61725301, the National Key R&D Program of China under Grant No. 2018YFC0910500, the National Major Scientific and Technological Special Project for “Significant New Drugs Development” under Grant No. 2019ZX09201004, and the Special Fund Project for Shanghai Informatization Development in Big Data under Grant No. 201901043.

Author information

Corresponding authors

Correspondence to Zhe Wang or Dongdong Li.

Ethics declarations

Conflict of interest

The authors state that there are no conflicts of interest between this manuscript and other published works.

About this article

Cite this article

Zhu, Z., Wang, Z., Li, D. et al. Efficient matrixized classification learning with separated solution process. Neural Comput & Applic 32, 10609–10632 (2020). https://doi.org/10.1007/s00521-019-04595-x

Keywords

  • Matrixized classifier
  • Training speed
  • Imbalanced problems
  • PAC-Bayes bound
  • Pattern recognition