Widened Learning of Bayesian Network Classifiers

  • Oliver R. Sampson
  • Michael R. Berthold
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9897)


We demonstrate the application of Widening to learning performant Bayesian networks for use as classifiers. Widening is a framework for utilizing parallel resources and diversity to find models in a hypothesis space that are potentially better than those found by a standard greedy algorithm. This work demonstrates that widened learning of Bayesian networks, using the Frobenius norm of the networks’ graph Laplacian matrices as a distance measure, can create Bayesian networks that are better classifiers than those generated by popular Bayesian network algorithms.
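As an illustration of the distance measure named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each network structure is given as a 0/1 adjacency matrix over the same ordered set of nodes, and it assumes the Laplacian is taken as L = D - A on the undirected skeleton of the graph; the abstract does not say how edge direction is handled. The function names `laplacian` and `frobenius_distance` are hypothetical.

```python
import math


def laplacian(adj):
    """Graph Laplacian L = D - A of the undirected skeleton of a graph.

    Assumption: edge direction is ignored, so i -> j and j -> i are
    treated as the same undirected edge.
    """
    n = len(adj)
    sym = [[1 if (adj[i][j] or adj[j][i]) else 0 for j in range(n)]
           for i in range(n)]
    # Diagonal holds node degrees, off-diagonal holds negated adjacency.
    return [[sum(sym[i]) if i == j else -sym[i][j] for j in range(n)]
            for i in range(n)]


def frobenius_distance(adj_a, adj_b):
    """Frobenius norm of the difference of the two graph Laplacians."""
    la, lb = laplacian(adj_a), laplacian(adj_b)
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(la, lb)
                         for x, y in zip(row_a, row_b)))


# Two structures over nodes (A, B, C): a chain A -> B -> C and a fork B <- A -> C.
chain = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
fork = [[0, 1, 1],
        [0, 0, 0],
        [0, 0, 0]]

print(frobenius_distance(chain, fork))
```

Under the skeleton assumption, two networks that differ only in the direction of an edge have distance zero, so a variant that keeps direction (for example, separate in-degree and out-degree Laplacians) may be closer to what a structure search needs.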


Keywords: Bayesian Network; Target Node; Minimum Description Length; Hypothesis Space; Markov Blanket
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Chair for Bioinformatics and Information Mining, Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
