Abstract
The naïve Bayes (NB) classifier is one of the most effective and efficient classification algorithms. Its elegant simplicity and apparent accuracy, even when the independence assumption is violated, sustain ongoing interest in the model. Rough set theory has been used for various tasks in knowledge discovery and applied successfully to many real-life problems. In this study we use the ability of rough sets to discover attribute dependencies in order to overcome NB's impractical independence assumption. We propose a new algorithm, Rough-Naive Bayes (RNB), that is expected to outperform other current NB variants. RNB adjusts attribute weights according to their dependencies and their contribution to the final decision. Experimental results show that RNB can achieve better performance than the NB classifier.
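The abstract does not state how RNB derives its weights, so the following is only a minimal sketch of the general idea of attribute-weighted naive Bayes, not the authors' algorithm. It assumes categorical attributes, uses the standard rough-set dependency degree of the decision on a single attribute (the fraction of objects whose attribute value determines the class uniquely), and applies a hypothetical weight of 1 + dependency as the exponent of each conditional probability. The function names, the weighting formula, and the toy data are all illustrative assumptions.

```python
from collections import Counter, defaultdict
import math


def dependency_degree(rows, labels, attr):
    """Rough-set dependency of the decision on one attribute: the share of
    objects whose equivalence class under `attr` has a single decision value."""
    decisions = defaultdict(set)
    sizes = Counter()
    for row, y in zip(rows, labels):
        decisions[row[attr]].add(y)
        sizes[row[attr]] += 1
    consistent = sum(n for v, n in sizes.items() if len(decisions[v]) == 1)
    return consistent / len(rows)


def train(rows, labels):
    """Collect the counts for naive Bayes plus one weight per attribute."""
    n_attrs = len(rows[0])
    prior = Counter(labels)
    cond = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = [set() for _ in range(n_attrs)]
    for row, y in zip(rows, labels):
        for a, v in enumerate(row):
            cond[(a, y)][v] += 1
            values[a].add(v)
    # Hypothetical weighting (an assumption, not taken from the paper):
    # a weight of 1 reproduces plain NB; higher dependency raises the weight.
    weights = [1.0 + dependency_degree(rows, labels, a) for a in range(n_attrs)]
    return prior, cond, values, weights


def predict(model, row):
    """Return the class maximising log P(c) + sum_a w_a * log P(x_a | c)."""
    prior, cond, values, weights = model
    total = sum(prior.values())
    best, best_score = None, -math.inf
    for y, n_y in prior.items():
        score = math.log(n_y / total)
        for a, v in enumerate(row):
            # Laplace smoothing keeps unseen values from zeroing the product.
            p = (cond[(a, y)][v] + 1) / (n_y + len(values[a]))
            score += weights[a] * math.log(p)
        if score > best_score:
            best, best_score = y, score
    return best


# Toy data (made up): the first attribute determines the class, the second does not.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
model = train(rows, labels)
print(model[3])                          # attribute weights
print(predict(model, ("sunny", "hot")))  # predicted class
```

With every weight fixed at 1 the scoring rule reduces to ordinary naive Bayes, which makes it easy to compare a weighted variant against the baseline on the same data.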
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Al-Aidaroos, K., Bakar, A.A., Othman, Z. (2010). Data Classification Using Rough Sets and Naïve Bayes. In: Yu, J., Greco, S., Lingras, P., Wang, G., Skowron, A. (eds) Rough Set and Knowledge Technology. RSKT 2010. Lecture Notes in Computer Science, vol 6401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16248-0_23
DOI: https://doi.org/10.1007/978-3-642-16248-0_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16247-3
Online ISBN: 978-3-642-16248-0
eBook Packages: Computer Science (R0)