Bayesian Methods Based on Label Semantics

Qin, Zengchang; Tang, Yongchuan

doi:10.1007/978-3-642-41251-6_6


Part of the book series: Advanced Topics in Science and Technology in China (ATSTC)


Abstract

In previous chapters, we introduced the Linguistic Decision Tree (LDT) model and showed how it can be used for classification and prediction. However, for some complex problems, good probability estimates can only be obtained with deep LDTs, which have low transparency. In such cases, how can we build a model that gives good probability estimates but uses compact LDTs? In this chapter, two hybrid learning models are proposed that combine the LDT model with the fuzzy Naive Bayes classifier. In the first model, an unlabeled instance is classified according to the Bayesian estimation given a single LDT. In the second model, a set of disjoint LDTs is used as Bayesian estimators. Experimental studies show that the first hybrid model has both better accuracy and better transparency than fuzzy Naive Bayes and LDTs at shallow tree depths. The second model is shown to have performance equivalent to that of the LDT model.
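
To make the fuzzy Naive Bayes component more concrete, the following is a minimal sketch of one common formulation. It is not the chapter's own algorithm. It assumes each continuous attribute is covered by overlapping triangular fuzzy labels, that class-conditional label probabilities are estimated from membership-weighted counts with Laplace smoothing, and that the likelihood of a value is the membership-weighted sum of those label probabilities. The function names, the label centres, and the toy data are all hypothetical.

```python
def triangular_memberships(x, centers):
    """Membership of x in triangular fuzzy labels centred at the sorted
    points in `centers`; the memberships always sum to 1."""
    m = [0.0] * len(centers)
    if x <= centers[0]:
        m[0] = 1.0
        return m
    if x >= centers[-1]:
        m[-1] = 1.0
        return m
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            m[i], m[i + 1] = 1.0 - w, w
            break
    return m


def fit(X, y, centers):
    """Estimate class priors and P(label | class) per attribute from
    membership-weighted counts (assumed scheme, Laplace-smoothed)."""
    classes = sorted(set(y))
    n_attr = len(X[0])
    prior = {c: 0.0 for c in classes}
    counts = {c: [[1.0] * len(centers[j]) for j in range(n_attr)]
              for c in classes}
    for row, c in zip(X, y):
        prior[c] += 1.0
        for j, x in enumerate(row):
            for k, mu in enumerate(triangular_memberships(x, centers[j])):
                counts[c][j][k] += mu
    model = {}
    for c in classes:
        cond = [[v / sum(col) for v in col] for col in counts[c]]
        model[c] = (prior[c] / len(y), cond)
    return model


def predict_proba(model, row, centers):
    """Posterior over classes under the naive independence assumption."""
    scores = {}
    for c, (p, cond) in model.items():
        s = p
        for j, x in enumerate(row):
            mus = triangular_memberships(x, centers[j])
            # Likelihood of x: membership-weighted mix of label probabilities.
            s *= sum(mu * cond[j][k] for k, mu in enumerate(mus))
        scores[c] = s
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}


if __name__ == "__main__":
    centers = [[0.0, 0.5, 1.0]]  # hypothetical labels: small, medium, large
    model = fit([[0.1], [0.2], [0.9], [0.8]], [0, 0, 1, 1], centers)
    print(predict_proba(model, [0.15], centers))
```

In a hybrid of the kind the abstract describes, the single-attribute labels used here would presumably be replaced by the branches of a shallow LDT, so that each estimator conditions on several attributes jointly. That substitution is outside the scope of this sketch.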



Author information

Authors and Affiliations

Intelligent Computing and Machine Learning Lab, School of ASEE, Beihang University, Beijing, China
Prof. Zengchang Qin
College of Computer Science, Zhejiang University, Hangzhou, Zhejiang, China
Prof. Yongchuan Tang



About this chapter

Cite this chapter

Qin, Z., Tang, Y. (2014). Bayesian Methods Based on Label Semantics. In: Uncertainty Modeling for Data Mining. Advanced Topics in Science and Technology in China. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41251-6_6


DOI: https://doi.org/10.1007/978-3-642-41251-6_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41250-9
Online ISBN: 978-3-642-41251-6
eBook Packages: Computer Science (R0)
