How Good Are the Bayesian Information Criterion and the Minimum Description Length Principle for Model Selection? A Bayesian Network Analysis

  • Conference paper
MICAI 2006: Advances in Artificial Intelligence (MICAI 2006)

Abstract

The Bayesian Information Criterion (BIC) and the Minimum Description Length (MDL) principle have been widely proposed as good metrics for model selection. Both scores consist of two terms: one for accuracy and one for complexity. The underlying idea is to find a model that properly balances these terms. Surprisingly, however, both metrics often do not work very well in practice because they overfit the data. In this paper, we present an analysis of the BIC and MDL scores, carried out within the framework of Bayesian networks, that supports this claim. To this end, we run several tests, including the recovery of gold-standard network structures and the construction and evaluation of Bayesian network classifiers. Finally, based on these results, we discuss the disadvantages of both metrics and propose future work to examine these limitations more deeply.
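To make the two-term structure concrete, the following is a minimal sketch that assumes the common textbook form of the score, BIC = log-likelihood − (k/2)·log N, with k free parameters and N samples. The preview does not show the exact formulation used in the paper, so the formula, the function names, and the toy data below are all illustrative assumptions. The sketch compares two candidate structures over two binary variables: X and Y independent, and X → Y.

```python
import math
from collections import Counter


def log_likelihood_independent(data):
    """Maximum log-likelihood of the structure with no edge: P(X) P(Y)."""
    n = len(data)
    ll = 0.0
    for idx in (0, 1):
        counts = Counter(row[idx] for row in data)
        ll += sum(c * math.log(c / n) for c in counts.values())
    return ll


def log_likelihood_edge(data):
    """Maximum log-likelihood of the structure X -> Y: P(X) P(Y | X)."""
    n = len(data)
    x_counts = Counter(x for x, _ in data)
    xy_counts = Counter(data)
    ll = sum(c * math.log(c / n) for c in x_counts.values())
    ll += sum(c * math.log(c / x_counts[x]) for (x, _), c in xy_counts.items())
    return ll


def bic(log_lik, num_params, num_samples):
    """Accuracy term minus complexity penalty (assumed textbook form)."""
    return log_lik - 0.5 * num_params * math.log(num_samples)


# Hypothetical toy data: (x, y) pairs with a clear dependence between X and Y.
dependent = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40
n = len(dependent)

# Free parameters for binary variables: 1 + 1 without an edge, 1 + 2 with X -> Y.
bic_indep = bic(log_likelihood_independent(dependent), 2, n)
bic_edge = bic(log_likelihood_edge(dependent), 3, n)
print("independent:", bic_indep, "X -> Y:", bic_edge)
```

The denser structure never has a lower maximum likelihood than the sparser one, so the complexity penalty is the only thing that can favour the simpler model. How well that penalty is calibrated is the question the abstract raises.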

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cruz-Ramírez, N., Acosta-Mesa, HG., Barrientos-Martínez, RE., Nava-Fernández, LA. (2006). How Good Are the Bayesian Information Criterion and the Minimum Description Length Principle for Model Selection? A Bayesian Network Analysis. In: Gelbukh, A., Reyes-Garcia, C.A. (eds) MICAI 2006: Advances in Artificial Intelligence. MICAI 2006. Lecture Notes in Computer Science, vol 4293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11925231_46

  • DOI: https://doi.org/10.1007/11925231_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49026-5

  • Online ISBN: 978-3-540-49058-6

  • eBook Packages: Computer Science, Computer Science (R0)
