Mining Incomplete Data with Many Lost and Attribute-Concept Values

  • Patrick G. Clark
  • Jerzy W. Grzymala-Busse
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9436)


This paper presents experimental results on twelve data sets with many missing attribute values, interpreted either as lost values or as attribute-concept values. Data mining was carried out using three kinds of probabilistic approximations: singleton, subset and concept. We compared the best results, taken over all three kinds of probabilistic approximations, for six data sets with lost values and six data sets with attribute-concept values, where the missing attribute values were located in the same places in both members of each pair. For five pairs of data sets the error rate, evaluated by ten-fold cross validation, was significantly smaller for lost values than for attribute-concept values (at the 5% significance level). For the remaining pair of data sets the two interpretations of missing attribute values did not differ significantly.
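The abstract names the three kinds of probabilistic approximations without defining them. The sketch below follows the definitions commonly used in the rough-set literature on incomplete data, which is an assumption here since this page does not state them: each case x has a characteristic set K(x), the conditional probability is Pr(X | K(x)) = |X ∩ K(x)| / |K(x)|, and a threshold α selects which cases or characteristic sets enter the approximation of a concept X. The function names, the four-case universe and the characteristic sets are hypothetical, chosen only for illustration; computing K(x) from lost or attribute-concept values is not shown.

```python
from fractions import Fraction


def conditional_probability(concept, k_set):
    """Pr(X | K) = |X ∩ K| / |K| for a non-empty characteristic set K."""
    return Fraction(len(concept & k_set), len(k_set))


def singleton_approximation(universe, concept, char_sets, alpha):
    """Cases x in U whose characteristic set satisfies Pr(X | K(x)) >= alpha."""
    return {x for x in universe
            if conditional_probability(concept, char_sets[x]) >= alpha}


def subset_approximation(universe, concept, char_sets, alpha):
    """Union of K(x) over all x in U with Pr(X | K(x)) >= alpha."""
    result = set()
    for x in universe:
        if conditional_probability(concept, char_sets[x]) >= alpha:
            result |= char_sets[x]
    return result


def concept_approximation(concept, char_sets, alpha):
    """Union of K(x) over x in X only, with Pr(X | K(x)) >= alpha."""
    result = set()
    for x in concept:
        if conditional_probability(concept, char_sets[x]) >= alpha:
            result |= char_sets[x]
    return result


# Hypothetical toy input: four cases, one concept, assumed characteristic sets.
universe = {1, 2, 3, 4}
concept = {1, 2}
char_sets = {1: {1, 2, 3}, 2: {2}, 3: {3, 4}, 4: {1, 4}}
alpha = Fraction(1, 2)

singleton = singleton_approximation(universe, concept, char_sets, alpha)
subset = subset_approximation(universe, concept, char_sets, alpha)
concept_apx = concept_approximation(concept, char_sets, alpha)
print(singleton, subset, concept_apx)
```

With α = 1/2 the three approximations differ on this toy input: the singleton approximation is {1, 2, 4}, the subset approximation is the whole universe, and the concept approximation is {1, 2, 3}. With α = 1 all three collapse to {2}, the only case whose characteristic set lies entirely inside the concept.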


Incomplete data; Lost values; Attribute-concept values; Probabilistic approximations; MLEM2 rule induction algorithm



Copyright information

© Springer International Publishing Switzerland 2015

Open Access. This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, USA
  2. Department of Expert Systems and Artificial Intelligence, University of Information Technology and Management, Rzeszow, Poland
