A Rough Set Approach to Incomplete Data

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9436)

Abstract

This paper presents the main directions of research on a rough set approach to incomplete data. First, three types of lower and upper approximations, all based on the characteristic relation, are defined. Then the idea of a probabilistic approximation, an extension of lower and upper approximations, is presented. Local probabilistic approximations are also discussed. Finally, some special topics are covered: consistency of incomplete data, and the problem of increasing data set incompleteness in order to improve rule set quality, measured by error rate.
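The abstract names the approximations without defining them. The sketch below is an illustration only, and it rests on assumptions that are not stated on this page: the definitions are those commonly used in the rough set literature on incomplete data (characteristic sets K_B(x); singleton, subset and concept approximations; a concept probabilistic approximation with threshold alpha), "?" marks a lost value, "*" marks a "do not care" condition, and the five-case table, its attribute names and all function names are hypothetical.

```python
from fractions import Fraction

LOST, DONT_CARE = "?", "*"  # assumed markers for the two kinds of missing values


def characteristic_sets(table, attributes):
    """Map every case x to its characteristic set K_B(x)."""
    universe = set(table)
    K = {}
    for x in universe:
        k = set(universe)
        for a in attributes:
            v = table[x][a]
            if v in (LOST, DONT_CARE):
                continue  # a missing value puts no constraint on K_B(x)
            # Block of (a, v): cases with value v, plus "do not care" cases.
            k &= {y for y in universe if table[y][a] in (v, DONT_CARE)}
        K[x] = frozenset(k)
    return K


def union(sets):
    """Union of an iterable of sets (empty set if the iterable is empty)."""
    return set().union(*sets)


def approximations(K, X):
    """Return {kind: (lower, upper)} for the concept X."""
    U = set(K)
    return {
        # Singleton: built from single cases.
        "singleton": ({x for x in U if K[x] <= X},
                      {x for x in U if K[x] & X}),
        # Subset: unions of characteristic sets of all cases in U.
        "subset": (union(K[x] for x in U if K[x] <= X),
                   union(K[x] for x in U if K[x] & X)),
        # Concept: unions of characteristic sets of cases in X only.
        "concept": (union(K[x] for x in X if K[x] <= X),
                    union(K[x] for x in X if K[x] & X)),
    }


def concept_probabilistic(K, X, alpha):
    """Union of K_B(x), x in X, with Pr(X | K_B(x)) >= alpha."""
    return union(K[x] for x in X
                 if Fraction(len(K[x] & X), len(K[x])) >= alpha)


# Hypothetical incomplete data set: case id -> attribute values.
table = {
    1: {"Temperature": "high", "Headache": "yes"},
    2: {"Temperature": LOST, "Headache": "yes"},
    3: {"Temperature": "high", "Headache": LOST},
    4: {"Temperature": "normal", "Headache": DONT_CARE},
    5: {"Temperature": "normal", "Headache": "no"},
}
X = {1, 2}  # hypothetical concept, e.g. the cases with decision Flu = yes

K = characteristic_sets(table, ["Temperature", "Headache"])
approx = approximations(K, X)
for kind, (lower, upper) in approx.items():
    print(kind, sorted(lower), sorted(upper))
print(sorted(concept_probabilistic(K, X, Fraction(1, 2))))
```

Under these assumed definitions, the three upper approximations differ on this table while the lower approximations coincide. With alpha = 1 the probabilistic approximation reduces to the concept lower approximation, and with a small positive alpha it reduces to the concept upper approximation.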

Author information

Corresponding author

Correspondence to Jerzy W. Grzymala-Busse.

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Grzymala-Busse, J.W. (2015). A Rough Set Approach to Incomplete Data. In: Ciucci, D., Wang, G., Mitra, S., Wu, W.-Z. (eds) Rough Sets and Knowledge Technology. RSKT 2015. Lecture Notes in Computer Science, vol 9436. Springer, Cham. https://doi.org/10.1007/978-3-319-25754-9_1

  • DOI: https://doi.org/10.1007/978-3-319-25754-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25753-2

  • Online ISBN: 978-3-319-25754-9

  • eBook Packages: Computer Science (R0)
