Data Classification Using Rough Sets and Naïve Bayes

  • Khadija Al-Aidaroos
  • Azuraliza Abu Bakar
  • Zalinda Othman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6401)


The naïve Bayes (NB) classifier is one of the most effective and efficient classification algorithms. Its elegant simplicity and apparent accuracy, even when the independence assumption is violated, foster ongoing interest in the model. Rough set theory has been used for various tasks in knowledge discovery and has been applied successfully to many real-life problems. In this study we use the ability of rough sets to discover attribute dependencies in order to overcome NB's impractical independence assumption. We propose a new algorithm, Rough-Naive Bayes (RNB), which is expected to outperform other current NB variants. RNB adjusts attribute weights according to their dependencies and their contribution to the final decision. Experimental results show that RNB can achieve better performance than the NB classifier.
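The abstract does not state how RNB computes its weights, so the following is only an illustrative sketch of the general idea of weighting naive Bayes attributes by a rough-set measure, not the authors' algorithm. It assumes categorical attributes, uses the single-attribute dependency degree (the fraction of objects whose attribute value determines the class unambiguously) as the weight, and applies that weight as an exponent on each conditional probability. The function names, the toy data, and the Laplace smoothing are all assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict


def dependency_degree(rows, attr_idx, labels):
    """Rough-set dependency degree of the decision on one attribute:
    the share of objects whose attribute value maps to a single class."""
    classes_by_value = defaultdict(set)
    for row, y in zip(rows, labels):
        classes_by_value[row[attr_idx]].add(y)
    consistent = sum(
        1 for row in rows if len(classes_by_value[row[attr_idx]]) == 1
    )
    return consistent / len(rows)


def train_weighted_nb(rows, labels, weights):
    """Collect the counts needed by a categorical naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> observed values
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains, list(weights)


def predict(model, row):
    """Return the class maximising log P(c) + sum_i w_i * log P(x_i | c)."""
    class_counts, value_counts, domains, weights = model
    n = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / n)
        for i, v in enumerate(row):
            # Laplace smoothing (an assumption) avoids log(0) for unseen values.
            p = (value_counts[(i, c)][v] + 1) / (n_c + len(domains[i]))
            score += weights[i] * math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical toy data: (outlook, temperature) -> play
rows = [
    ("sunny", "hot"), ("sunny", "mild"),
    ("rain", "hot"), ("rain", "mild"),
    ("overcast", "hot"), ("overcast", "mild"),
]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

weights = [dependency_degree(rows, i, labels) for i in range(2)]
model = train_weighted_nb(rows, labels, weights)
prediction = predict(model, ("sunny", "hot"))
```

In this toy data the first attribute fully determines the class and the second carries no information, so the weights suppress the second attribute entirely. With all weights set to 1, the same code reduces to ordinary naive Bayes.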


Keywords: Classification, Naïve Bayes (NB), NB variants, Rough Sets (RS), attribute dependency, weighted NB





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Khadija Al-Aidaroos
    • 1
  • Azuraliza Abu Bakar
    • 1
  • Zalinda Othman
    • 1
  1. Center for Artificial Intelligence Technology (CAIT), Faculty of Information and Science Technology, Universiti Kebangsaan Malaysia, Bangi, Malaysia
