Local Structure Learning in Graphical Models

  • Christian Borgelt
  • Rudolf Kruse
Part of the International Centre for Mechanical Sciences book series (CISM, volume 472)


A topic in probabilistic network learning is to exploit local network structure, i.e., to capture regularities in the conditional probability distributions and to learn networks with such local structure from data. In this paper we present a modification of the learning algorithm for Bayesian networks with a local decision graph representation suggested by Chickering et al. (1997), which is often more efficient. It rests on the idea of exploiting the decision graph structure not only to capture a larger set of regularities than decision trees can, but also to improve the learning process. In addition, we study how the properties of the evaluation measure used influence the learning time, and we identify three classes of evaluation measures.
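The local structure the abstract refers to can be pictured as a decision graph over the parent variables of a node, with conditional distributions stored at the leaves; parent contexts that reach the same leaf share parameters (context-specific independence). The following is a minimal illustrative sketch under assumed names (`Leaf`, `Split`, `lookup` are not from the paper):

```python
# Illustrative sketch: a conditional probability table (CPT) with local
# structure, stored as a decision graph whose leaves hold distributions.
# Contexts that reach the same Leaf object share parameters.

class Leaf:
    def __init__(self, dist):
        self.dist = dist          # e.g. {"y": 0.9, "n": 0.1}

class Split:
    def __init__(self, parent, branches):
        self.parent = parent      # parent variable tested at this node
        self.branches = branches  # parent value -> subtree (Leaf or Split)

def lookup(node, assignment):
    """Return P(child | parents) by walking the graph for the given
    parent assignment (a dict: variable -> value)."""
    while isinstance(node, Split):
        node = node.branches[assignment[node.parent]]
    return node.dist

# Example for P(C | A, B): the contexts {A=off} and {A=on, B=f}
# deliberately share one leaf, so two distinct paths meet in one node.
shared = Leaf({"y": 0.5, "n": 0.5})
cpt = Split("A", {
    "off": shared,
    "on": Split("B", {"t": Leaf({"y": 0.9, "n": 0.1}),
                      "f": shared}),
})

lookup(cpt, {"A": "off", "B": "t"})   # -> {"y": 0.5, "n": 0.5}
```

Because two different paths end in the same leaf, this is a graph rather than a tree; a decision tree would have to duplicate that distribution, which is the larger set of regularities decision graphs can capture.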


Keywords: Bayesian Network · Local Structure · Graphical Model · Leaf Node · Information Gain




  1. S.K. Andersen, K.G. Olesen, F.V. Jensen, and F. Jensen. HUGIN — A Shell for Building Bayesian Belief Universes for Expert Systems. Proc. 11th Int. Joint Conf. on Artificial Intelligence (IJCAI'89), pp. 1080–1085, 1989
  2. P.W. Baim. A Method for Attribute Selection in Inductive Learning Systems. IEEE Trans. on Pattern Analysis and Machine Intelligence 10:888–896, 1988
  3. C. Borgelt and R. Kruse. Evaluation Measures for Learning Probabilistic and Possibilistic Networks. Proc. 6th IEEE Int. Conf. on Fuzzy Systems (FUZZ-IEEE'97), Vol. 2, pp. 1034–1038, Barcelona, Spain, 1997
  4. C. Borgelt and R. Kruse. Some Experimental Results on Learning Probabilistic and Possibilistic Networks with Different Evaluation Measures. Proc. 1st Int. Joint Conference on Qualitative and Quantitative Practical Reasoning (ECSQARU/FAPR'97), pp. 71–85, Springer, Berlin, Germany, 1997
  5. C. Borgelt and R. Kruse. Graphical Models — Methods for Data Analysis and Mining. J. Wiley & Sons, Chichester, United Kingdom, 2002
  6. C. Boutilier, N. Friedman, M. Goldszmidt, and D. Koller. Context-Specific Independence in Bayesian Networks. Proc. 12th Conf. on Uncertainty in Artificial Intelligence (UAI'96), Portland, OR, 1996
  7. L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. Wadsworth International Group, Belmont, CA, 1984
  8. W. Buntine. Theory Refinement on Bayesian Networks. Proc. 7th Conf. on Uncertainty in Artificial Intelligence (UAI'91), pp. 52–60, Morgan Kaufmann, Los Angeles, CA, 1991
  9. D.M. Chickering, D. Heckerman, and C. Meek. A Bayesian Approach to Learning Bayesian Networks with Local Structure. Proc. 13th Conf. on Uncertainty in Artificial Intelligence (UAI'97), pp. 80–89, Morgan Kaufmann, San Francisco, CA, 1997
  10. C.K. Chow and C.N. Liu. Approximating Discrete Probability Distributions with Dependence Trees. IEEE Trans. on Information Theory 14(3):462–467, 1968
  11. G.F. Cooper and E. Herskovits. A Bayesian Method for the Induction of Probabilistic Networks from Data. Machine Learning 9:309–347, 1992
  12. J. Gebhardt and R. Kruse. The Context Model — An Integrating View of Vagueness and Uncertainty. Int. Journal of Approximate Reasoning 9:283–314, 1993
  13. J. Gebhardt and R. Kruse. POSSINFER — A Software Tool for Possibilistic Inference. In: D. Dubois, H. Prade, and R. Yager, eds. Fuzzy Set Methods in Information Engineering: A Guided Tour of Applications, Wiley, 1995
  14. J. Gebhardt and R. Kruse. Learning Possibilistic Networks from Data. Proc. 5th Int. Workshop on Artificial Intelligence and Statistics, pp. 233–244, Fort Lauderdale, FL, 1995
  15. J. Gebhardt and R. Kruse. Tightest Hypertree Decompositions of Multivariate Possibility Distributions. Proc. Int. Conf. on Information Processing and Management of Uncertainty in Knowledge-based Systems (IPMU'96), 1996
  16. J. Gebhardt. Learning from Data: Possibilistic Graphical Models. Habilitation thesis, University of Braunschweig, Germany, 1997
  17. D. Geiger and D. Heckerman. Advances in Probabilistic Reasoning. Proc. 7th Conf. on Uncertainty in Artificial Intelligence (UAI'91), pp. 118–126, Morgan Kaufmann, San Francisco, CA, 1991
  18. D. Heckerman. Probabilistic Similarity Networks. MIT Press, 1991
  19. D. Heckerman, D. Geiger, and D.M. Chickering. Learning Bayesian Networks: The Combination of Knowledge and Statistical Data. Machine Learning 20:197–243, 1995
  20. M. Higashi and G.J. Klir. Measures of Uncertainty and Information Based on Possibility Distributions. Int. Journal of General Systems 9:43–58, 1982
  21. K. Kira and L. Rendell. A Practical Approach to Feature Selection. Proc. 9th Int. Conf. on Machine Learning (ICML'92), pp. 250–256, Morgan Kaufmann, San Francisco, CA, 1992
  22. G.J. Klir and M. Mariano. On the Uniqueness of a Possibility Measure of Uncertainty and Information. Fuzzy Sets and Systems 24:141–160, 1987
  23. I. Kononenko. Estimating Attributes: Analysis and Extensions of RELIEF. Proc. 7th Europ. Conf. on Machine Learning (ECML'94), Springer, New York, NY, 1994
  24. I. Kononenko. On Biases in Estimating Multi-Valued Attributes. Proc. 1st Int. Conf. on Knowledge Discovery and Data Mining, pp. 1034–1040, Montreal, Canada, 1995
  25. R.E. Krichevsky and V.K. Trofimov. The Performance of Universal Coding. IEEE Trans. on Information Theory 27(2):199–207, 1981
  26. R. Kruse, E. Schwecke, and J. Heinsohn. Uncertainty and Vagueness in Knowledge-based Systems: Numerical Methods. Series: Artificial Intelligence, Springer, Berlin, 1991
  27. R. Kruse, J. Gebhardt, and F. Klawonn. Foundations of Fuzzy Systems. John Wiley & Sons, Chichester, England, 1994
  28. S. Kullback and R.A. Leibler. On Information and Sufficiency. Ann. Math. Statistics 22:79–86, 1951
  29. S.L. Lauritzen and D.J. Spiegelhalter. Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems. Journal of the Royal Statistical Society, Series B, 50(2):157–224, 1988
  30. R. Lopez de Mantaras. A Distance-based Attribute Selection Measure for Decision Tree Induction. Machine Learning 6:81–92, 1991
  31. H.T. Nguyen. Using Random Sets. Information Science 34:265–274, 1984
  32. J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (2nd edition). Morgan Kaufmann, New York, 1992
  33. D. Poole. Probabilistic Horn Abduction and Bayesian Networks. Artificial Intelligence 64(1):81–129, 1993
  34. J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993
  35. L.K. Rasmussen. Blood Group Determination of Danish Jersey Cattle in the F-blood Group System. Dina Research Report no. 8, 1992
  36. J. Rissanen. Stochastic Complexity. Journal of the Royal Statistical Society, Series B, 49:223–239, 1987
  37. A. Saffiotti and E. Umkehrer. PULCINELLA: A General Tool for Propagating Uncertainty in Valuation Networks. Proc. 7th Conf. on Uncertainty in Artificial Intelligence (UAI'91), pp. 323–331, San Mateo, CA, 1991
  38. G. Shafer and P.P. Shenoy. Local Computations in Hypertrees. Working Paper 201, School of Business, University of Kansas, Lawrence, KS, 1988
  39. P.P. Shenoy. Valuation-based Systems: A Framework for Managing Uncertainty in Expert Systems. Working Paper 226, School of Business, University of Kansas, Lawrence, KS, 1991
  40. J.E. Smith, S. Holtzman, and J.E. Matheson. Structuring Conditional Relationships in Influence Diagrams. Operations Research 41(2):280–297, 1993
  41. L. Wehenkel. On Uncertainty Measures Used for Decision Tree Induction. Proc. Int. Conf. on Information Processing and Management of Uncertainty in Knowledge-based Systems (IPMU'96), 1996
  42. X. Zhou and T.S. Dillon. A Statistical-heuristic Feature Selection Criterion for Decision Tree Induction. IEEE Trans. on Pattern Analysis and Machine Intelligence 13:834–841, 1991

Copyright information

© Springer-Verlag Wien 2003

Authors and Affiliations

  • Christian Borgelt (1)
  • Rudolf Kruse (1)

  1. Dept. of Knowledge Processing and Language Engineering, Otto-von-Guericke-University of Magdeburg, Magdeburg, Germany
