Multiple-Side Multiple-Learner for Incomplete Data Classification

  • Yuan-ting Yan
  • Yan-Ping Zhang
  • Xiu-Quan Du
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9437)


Selective classifiers can improve classification accuracy and algorithm efficiency by removing irrelevant attributes from the data. However, most of them handle only complete data, while real-world datasets are often incomplete for various reasons. An incomplete dataset may also contain irrelevant attributes, which degrade algorithm performance. After analyzing the main classification methods for incomplete data, this paper proposes a Multiple-Side Multiple-Learner algorithm for incomplete data (MSML). MSML first selects a feature subset of the original incomplete dataset based on the chi-square statistic. Then, according to the missing-value patterns over the selected feature subset, MSML partitions the data into a group of subsets. Each subset is used to train a sub-classifier based on the bagging algorithm. Finally, the outputs of the different sub-classifiers are combined by weighted majority voting. Experimental results on incomplete UCI datasets show that MSML can effectively reduce the number of attributes and thus improve execution efficiency, while also improving classification accuracy and algorithm stability.


Keywords: Incomplete data · Multiple-side · Feature subset · Multiple-learner
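The pipeline the abstract describes (chi-square feature selection, partitioning by missing-value pattern, one sub-learner per partition, weighted majority voting) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: all names are invented, the sub-learner here is a simple 1-nearest-neighbor stand-in for the bagging-trained sub-classifiers of the paper, and vote weights use subset size as a stand-in for whatever accuracy-based weights the paper uses.

```python
# Hypothetical sketch of the MSML pipeline described in the abstract.
# All function names are illustrative; MISSING marks an absent value.
from collections import Counter, defaultdict

MISSING = None


def chi_square(column, labels):
    """Chi-square statistic between one categorical attribute and the
    class label, computed over rows where the attribute is observed."""
    pairs = [(v, y) for v, y in zip(column, labels) if v is not MISSING]
    n = len(pairs)
    if n == 0:
        return 0.0
    obs = Counter(pairs)
    val_tot = Counter(v for v, _ in pairs)
    lab_tot = Counter(y for _, y in pairs)
    stat = 0.0
    for v in val_tot:
        for y in lab_tot:
            expected = val_tot[v] * lab_tot[y] / n
            stat += (obs[(v, y)] - expected) ** 2 / expected
    return stat


def select_features(rows, labels, k):
    """Keep the k attributes with the largest chi-square statistic."""
    scores = [(chi_square([r[j] for r in rows], labels), j)
              for j in range(len(rows[0]))]
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def fit(rows, labels, k):
    """Group training instances by which of the selected features they
    have observed; each group becomes one sub-learner's training set."""
    feats = select_features(rows, labels, k)
    groups = defaultdict(list)
    for r, y in zip(rows, labels):
        pattern = tuple(j for j in feats if r[j] is not MISSING)
        if pattern:
            groups[pattern].append((r, y))
    return groups


def predict(groups, row):
    """Weighted majority vote over the sub-learners whose feature pattern
    is fully observed in this row.  The 1-NN sub-learner and the
    size-based weight are stand-ins for the paper's bagging ensemble."""
    votes = Counter()
    for pattern, members in groups.items():
        if all(row[j] is not MISSING for j in pattern):
            best = min(members,
                       key=lambda m: sum(m[0][j] != row[j] for j in pattern))
            votes[best[1]] += len(members)  # weight ~ subset size
    return votes.most_common(1)[0][0] if votes else None
```

A row with a missing value simply skips the sub-learners whose feature pattern it cannot satisfy, which is the point of training one learner per missing-value pattern rather than imputing.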



This work was supported by the National Natural Science Foundation of China (Nos. 61175046 and 61203290).



Copyright information

© Springer International Publishing Switzerland 2015

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, Hefei, China
