Data Classification Using Rough Sets and Naïve Bayes

Al-Aidaroos, Khadija; Bakar, Azuraliza Abu; Othman, Zalinda

doi:10.1007/978-3-642-16248-0_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6401)

Included in the following conference series:

International Conference on Rough Sets and Knowledge Technology

Abstract

The naïve Bayesian classifier is one of the most effective and efficient classification algorithms. The elegant simplicity and apparent accuracy of naïve Bayes (NB), even when the independence assumption is violated, foster ongoing interest in the model. Rough set theory has been used for different tasks in knowledge discovery and has been applied successfully to many real-life problems. In this study we use the ability of rough sets to discover attribute dependencies in order to overcome the impractical independence assumption of NB. We propose a new algorithm called Rough-Naive Bayes (RNB) that is expected to outperform other current NB variants. RNB adjusts the attributes' weights according to their dependencies and their contribution to the final decision. Experimental results show that RNB can achieve better performance than the NB classifier.
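The abstract does not give the RNB weighting formula, so the following is only a minimal sketch of the general idea of attribute-weighted naive Bayes, not the authors' algorithm. It assumes, purely for illustration, that each attribute's weight is the standard rough-set dependency degree of the decision on that single attribute (the fraction of objects whose attribute value determines the class uniquely), and that each weight acts as an exponent on the corresponding conditional probability. The function names, the toy data, and the Laplace smoothing are all hypothetical choices.

```python
from collections import Counter, defaultdict
import math

def dependency_degree(rows, labels, attr):
    """Rough-set dependency degree of the decision on one attribute:
    |POS| / |U|, where POS holds the objects whose value of `attr`
    occurs with exactly one class label."""
    classes_by_value = defaultdict(set)
    for row, y in zip(rows, labels):
        classes_by_value[row[attr]].add(y)
    consistent = sum(1 for row in rows if len(classes_by_value[row[attr]]) == 1)
    return consistent / len(rows)

def train_weighted_nb(rows, labels, weights):
    """Collect the counts needed by a categorical naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = [set() for _ in rows[0]]   # distinct values seen per attribute
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
            domains[i].add(v)
    return {"class_counts": class_counts, "value_counts": value_counts,
            "domains": domains, "weights": weights, "n": len(rows)}

def predict(model, row):
    """Return argmax_c of log P(c) + sum_i w_i * log P(x_i | c)."""
    best_class, best_score = None, -math.inf
    for c, n_c in model["class_counts"].items():
        score = math.log(n_c / model["n"])
        for i, v in enumerate(row):
            # Laplace-smoothed estimate of P(x_i = v | c) (an assumption here)
            p = (model["value_counts"][(i, c)][v] + 1) / (n_c + len(model["domains"][i]))
            score += model["weights"][i] * math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical toy data: attribute 0 determines the class, attribute 1 does not.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "hot"), ("rain", "mild")]
labels = ["no", "no", "yes", "yes"]
weights = [dependency_degree(rows, labels, i) for i in range(len(rows[0]))]
model = train_weighted_nb(rows, labels, weights)
prediction = predict(model, ("rain", "hot"))
```

On this toy data the uninformative second attribute receives weight 0 and drops out of the score, which illustrates how weighting can soften the independence assumption. A practical variant would likely need weights computed over attribute subsets or reducts rather than single attributes.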

Author information

Authors and Affiliations

Center for Artificial Intelligence Technology (CAIT), Faculty of Information and Science Technology, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia
Khadija Al-Aidaroos, Azuraliza Abu Bakar & Zalinda Othman

Editor information

Editors and Affiliations

School of Computer and Information Technology, Beijing Jiaotong University, 100044, Beijing, China
Jian Yu
Faculty of Economics, University of Catania, Corso Italia, 55, 95129, Catania, Italy
Salvatore Greco
Department of Mathematics and Computing Science, Saint Mary’s University, B3H 3C3, Halifax, Nova Scotia, Canada
Pawan Lingras
Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
Guoyin Wang
Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Andrzej Skowron

About this paper

Cite this paper

Al-Aidaroos, K., Bakar, A.A., Othman, Z. (2010). Data Classification Using Rough Sets and Naïve Bayes. In: Yu, J., Greco, S., Lingras, P., Wang, G., Skowron, A. (eds) Rough Set and Knowledge Technology. RSKT 2010. Lecture Notes in Computer Science, vol 6401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16248-0_23

DOI: https://doi.org/10.1007/978-3-642-16248-0_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16247-3
Online ISBN: 978-3-642-16248-0
eBook Packages: Computer Science, Computer Science (R0)
