Bayesian Methods Based on Label Semantics

  • Chapter
Uncertainty Modeling for Data Mining

Part of the book series: Advanced Topics in Science and Technology in China (ATSTC)

Abstract

In previous chapters, we introduced the Linguistic Decision Tree (LDT) model and showed how it can be used for classification and prediction. For some complex problems, however, good probability estimates can only be obtained with deep LDTs, which have low transparency. In such cases, how can we build a model that gives good probability estimates while using compact LDTs? This chapter proposes two hybrid learning models that combine the LDT model with the fuzzy Naive Bayes classifier. In the first model, an unlabeled instance is classified according to the Bayesian estimation given a single LDT. In the second model, a set of disjoint LDTs is used as Bayesian estimators. Experimental studies show that, at shallow tree depths, the first hybrid model has both better accuracy and better transparency than fuzzy Naive Bayes and LDTs. The second model is shown to have performance equivalent to that of the LDT model.
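
For readers unfamiliar with the fuzzy Naive Bayes component mentioned above, the following is a minimal illustrative sketch, not the chapter's algorithm. It assumes triangular fuzzy labels with 50% overlap on each attribute, estimates the conditional probability of each label given a class from accumulated membership values, and combines attributes under the usual naive independence assumption. All function names, the label centres, and the smoothing constant are hypothetical choices made for illustration; the label semantics framework used in the book defines its own mass assignments, and the hybrid LDT models are not reproduced here.

```python
from collections import defaultdict

def triangular_memberships(x, centers):
    """Memberships of x in triangular fuzzy labels centred at sorted `centers`.
    Adjacent labels overlap by 50%, so the memberships always sum to 1."""
    m = [0.0] * len(centers)
    if x <= centers[0]:
        m[0] = 1.0
        return m
    if x >= centers[-1]:
        m[-1] = 1.0
        return m
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            m[i], m[i + 1] = 1.0 - w, w
            break
    return m

def train(X, y, centers_per_attr, smoothing=1e-6):
    """Estimate class priors and P(label | class) for every attribute.
    `smoothing` is an assumed small constant that avoids zero probabilities."""
    class_counts = defaultdict(float)
    label_mass = {}  # label_mass[c][j][l]: membership mass of label l, attribute j, class c
    for xs, c in zip(X, y):
        class_counts[c] += 1.0
        if c not in label_mass:
            label_mass[c] = [[smoothing] * len(cs) for cs in centers_per_attr]
        for j, (x, cs) in enumerate(zip(xs, centers_per_attr)):
            for l, m in enumerate(triangular_memberships(x, cs)):
                label_mass[c][j][l] += m
    total = sum(class_counts.values())
    priors = {c: n / total for c, n in class_counts.items()}
    cond = {c: [[v / sum(row) for v in row] for row in rows]
            for c, rows in label_mass.items()}
    return priors, cond

def predict_proba(x, priors, cond, centers_per_attr):
    """Posterior over classes: prior times, per attribute, the
    membership-weighted sum of P(label | class), then normalised."""
    scores = {}
    for c, prior in priors.items():
        s = prior
        for j, (xj, cs) in enumerate(zip(x, centers_per_attr)):
            mem = triangular_memberships(xj, cs)
            s *= sum(m * p for m, p in zip(mem, cond[c][j]))
        scores[c] = s
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

# Toy usage: one attribute described by three labels (e.g. small, medium, large).
centers = [[0.0, 0.5, 1.0]]
X = [[0.1], [0.2], [0.9], [0.8]]
y = ["a", "a", "b", "b"]
priors, cond = train(X, y, centers)
posterior = predict_proba([0.15], priors, cond, centers)
```

In this sketch the transparency comes from the label-level conditional probabilities, which can be read as weighted linguistic rules; the chapter's hybrid models instead use the branches of an LDT as the units over which Bayesian estimation is performed.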

Copyright information

© 2014 Zhejiang University Press, Hangzhou and Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Qin, Z., Tang, Y. (2014). Bayesian Methods Based on Label Semantics. In: Uncertainty Modeling for Data Mining. Advanced Topics in Science and Technology in China. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41251-6_6

  • DOI: https://doi.org/10.1007/978-3-642-41251-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41250-9

  • Online ISBN: 978-3-642-41251-6

  • eBook Packages: Computer Science (R0)
