How Good Are the Bayesian Information Criterion and the Minimum Description Length Principle for Model Selection? A Bayesian Network Analysis

Cruz-Ramírez, Nicandro; Acosta-Mesa, Héctor-Gabriel; Barrientos-Martínez, Rocío-Erandi; Nava-Fernández, Luis-Alonso

doi:10.1007/11925231_46

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4293)

Included in the following conference series:

Mexican International Conference on Artificial Intelligence

Abstract

The Bayesian Information Criterion (BIC) and the Minimum Description Length (MDL) principle have been widely proposed as good metrics for model selection. Both scores consist essentially of two terms: one for accuracy and one for complexity. Their philosophy is to find a model that properly balances these terms. Surprisingly, however, both metrics often do not work well in practice because they overfit the data. In this paper, we analyze the BIC and MDL scores within the framework of Bayesian networks and present evidence that supports this claim. To this end, we carry out tests that include the recovery of gold-standard network structures as well as the construction and evaluation of Bayesian network classifiers. Finally, based on these results, we discuss the disadvantages of both metrics and propose future work to examine these limitations more deeply.
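
To make the two-term structure concrete, the following is a minimal sketch of one common textbook form of the BIC score for a discrete Bayesian network: the maximized log-likelihood (accuracy) minus (k/2) ln N (complexity), where k is the number of free parameters and N the sample size. This form is an assumption for illustration and is not necessarily the exact formulation used in the paper; under one common convention, the two-part MDL score is the negative of this quantity. The function name `bic_score`, the dictionary-based data layout, and the toy data set are hypothetical.

```python
import math
from collections import Counter

def bic_score(data, parents, arity):
    """Return (score, log_likelihood, penalty) for a discrete Bayesian network.

    data    : list of dicts mapping variable name -> observed value
    parents : dict mapping variable name -> tuple of parent names (the structure)
    arity   : dict mapping variable name -> number of possible values
    Assumed form: score = log-likelihood - (k / 2) * ln(N).
    """
    n = len(data)
    log_lik = 0.0
    n_params = 0
    for var, pa in parents.items():
        # Counts of (parent configuration, value) and of parent configuration alone.
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        # Accuracy term: log-likelihood under maximum-likelihood parameters.
        log_lik += sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
        # Complexity term: free parameters = (values - 1) per parent configuration.
        n_params += (arity[var] - 1) * math.prod(arity[p] for p in pa)
    penalty = 0.5 * n_params * math.log(n)
    return log_lik - penalty, log_lik, penalty

# Hypothetical toy data: Y is an exact copy of X.
data = [{"X": 0, "Y": 0}] * 4 + [{"X": 1, "Y": 1}] * 4
arity = {"X": 2, "Y": 2}

empty = bic_score(data, {"X": (), "Y": ()}, arity)      # no arcs
edge = bic_score(data, {"X": (), "Y": ("X",)}, arity)   # X -> Y

print("no arcs:", empty)
print("X -> Y :", edge)
```

On this toy data the structure with the arc X -> Y pays a larger complexity penalty but gains enough log-likelihood to obtain the higher score. How well this trade-off behaves on real data is the question the paper examines.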

Author information

Authors and Affiliations

Facultad de Física e Inteligencia Artificial, Universidad Veracruzana, Sebastián Camacho 5, Col. Centro, C.P. 91000, Xalapa, Veracruz, México
Nicandro Cruz-Ramírez, Héctor-Gabriel Acosta-Mesa & Rocío-Erandi Barrientos-Martínez
Instituto de Investigaciones en Educación, Universidad Veracruzana, Diego Leño 8 esq., Morelos, Col. Centro, C.P. 91000, Xalapa, Veracruz, México
Luis-Alonso Nava-Fernández

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, 07738, Mexico City, México
Alexander Gelbukh
Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Luis Enrique Erro No. 1, Sta. Ma. Tonanzintla, 72840, Puebla, México
Carlos Alberto Reyes-Garcia

Cite this paper

Cruz-Ramírez, N., Acosta-Mesa, H.-G., Barrientos-Martínez, R.-E., Nava-Fernández, L.-A. (2006). How Good Are the Bayesian Information Criterion and the Minimum Description Length Principle for Model Selection? A Bayesian Network Analysis. In: Gelbukh, A., Reyes-Garcia, C.A. (eds) MICAI 2006: Advances in Artificial Intelligence. MICAI 2006. Lecture Notes in Computer Science, vol 4293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11925231_46

DOI: https://doi.org/10.1007/11925231_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49026-5
Online ISBN: 978-3-540-49058-6
eBook Packages: Computer Science, Computer Science (R0)
