Abstract
Naive Bayes is a popular method for supervised classification. Its attribute conditional independence assumption makes Naive Bayes efficient, but it adversely affects the quality of classification results in many real-world applications. In this paper, a new feature-selection-based method is proposed for semi-naive Bayesian classification that relaxes this assumption. A weighted kernel density model is first proposed for Bayesian modeling, implementing a soft feature-selection scheme. We then propose an efficient algorithm that learns an optimized set of feature weights, using the least squares cross-validation method for optimal bandwidth selection. Experimental studies on six real-world datasets show the effectiveness and suitability of the proposed method for efficient Bayesian classification.
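To illustrate the idea of weighted kernel density estimation as soft feature selection, the following is a minimal sketch (not the paper's actual algorithm, which learns the weights via least squares cross-validation): each class-conditional feature density is estimated with a 1-D Gaussian KDE, and a per-feature weight in [0, 1] exponentiates that density, so a weight of 0 effectively removes the feature while a weight of 1 uses it fully. The fixed `weights` and `bandwidth` values below are illustrative assumptions.

```python
import numpy as np

def kde_1d(train_vals, x, bandwidth):
    # 1-D Gaussian kernel density estimate at point x
    u = (x - train_vals) / bandwidth
    return np.mean(np.exp(-0.5 * u**2)) / (bandwidth * np.sqrt(2 * np.pi))

def class_scores(x, X, y, weights, bandwidth=0.5):
    """Weighted-KDE naive Bayes log-scores for each class.

    weights[j] acts as a soft feature-selection exponent: in log space
    it simply scales feature j's log-density contribution.
    """
    scores = {}
    for c in np.unique(y):
        Xc = X[y == c]
        log_p = np.log(len(Xc) / len(X))  # class prior
        for j, w in enumerate(weights):
            dens = kde_1d(Xc[:, j], x[j], bandwidth)
            log_p += w * np.log(dens + 1e-12)  # weighted log-density
        scores[c] = log_p
    return scores

# Toy data: classes separated along feature 0; feature 1 is pure noise.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
X[:, 1] = rng.normal(0, 1, 100)   # overwrite feature 1 with class-independent noise
y = np.array([0] * 50 + [1] * 50)

weights = [1.0, 0.1]              # down-weight the noisy feature
scores = class_scores(np.array([2.8, 0.0]), X, y, weights)
print(max(scores, key=scores.get))
```

In the paper's full method the weights are not hand-set as above but optimized jointly with the kernel bandwidths, which is what makes the feature selection data-driven.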
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Chen, L., Wang, S. (2012). Semi-naive Bayesian Classification by Weighted Kernel Density Estimation. In: Zhou, S., Zhang, S., Karypis, G. (eds) Advanced Data Mining and Applications. ADMA 2012. Lecture Notes in Computer Science(), vol 7713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35527-1_22
DOI: https://doi.org/10.1007/978-3-642-35527-1_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35526-4
Online ISBN: 978-3-642-35527-1
eBook Packages: Computer Science (R0)