Markov Chain Monte Carlo Methods for Hierarchical Bayesian Expert Systems

  • Jeremy C. York
  • David Madigan
Conference paper
Part of the Lecture Notes in Statistics book series (LNS, volume 89)


In a hierarchical Bayesian expert system, the probabilities relating the variables are not known precisely; rather, imprecise knowledge of these probabilities is described by placing prior distributions on them. After obtaining data, one would like to update those distributions to reflect the new information gained; however, this can prove difficult computationally if the observed data are incomplete. This paper describes a way around these difficulties: the use of Markov chain Monte Carlo methods.
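As a rough illustration of the idea summarized above, and not the authors' own algorithm, the sketch below runs a Gibbs sampler on a hypothetical two-node network A -> B in which some values of A are unobserved. The network, the Beta(1, 1) priors, the function name `gibbs_two_node`, and the data layout are all assumptions made for this example. Each sweep imputes the missing values of A given the current probabilities, then redraws the probabilities from their Beta posteriors given the completed data.

```python
import random


def gibbs_two_node(data, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for a hypothetical binary network A -> B.

    data: list of (a, b) pairs with b in {0, 1} and a in {0, 1, None},
          where None marks a missing observation of A.
    Returns posterior-mean estimates of P(A=1), P(B=1|A=0), P(B=1|A=1).
    Assumes independent Beta(1, 1) priors on all three probabilities.
    """
    rng = random.Random(seed)
    theta_a = 0.5            # current draw of P(A=1)
    theta_b = [0.5, 0.5]     # current draws of P(B=1 | A=0), P(B=1 | A=1)
    sums = [0.0, 0.0, 0.0]
    kept = 0

    for it in range(n_iter):
        # Step 1: impute each missing A from P(A | B, current parameters).
        completed = []
        for a, b in data:
            if a is None:
                w1 = theta_a * (theta_b[1] if b else 1.0 - theta_b[1])
                w0 = (1.0 - theta_a) * (theta_b[0] if b else 1.0 - theta_b[0])
                a = 1 if rng.random() < w1 / (w1 + w0) else 0
            completed.append((a, b))

        # Step 2: draw parameters from Beta posteriors given completed data.
        n_a1 = sum(a for a, _ in completed)
        n_a0 = len(completed) - n_a1
        theta_a = rng.betavariate(1 + n_a1, 1 + n_a0)
        for a_val in (0, 1):
            n_b1 = sum(1 for a, b in completed if a == a_val and b == 1)
            n_b0 = sum(1 for a, b in completed if a == a_val and b == 0)
            theta_b[a_val] = rng.betavariate(1 + n_b1, 1 + n_b0)

        # Accumulate draws after the burn-in period.
        if it >= burn_in:
            sums[0] += theta_a
            sums[1] += theta_b[0]
            sums[2] += theta_b[1]
            kept += 1

    return {
        "p_a1": sums[0] / kept,
        "p_b1_given_a0": sums[1] / kept,
        "p_b1_given_a1": sums[2] / kept,
    }


if __name__ == "__main__":
    # Made-up data: 80 complete records and 40 records with A missing.
    example = ([(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 30
               + [(None, 1)] * 20 + [(None, 0)] * 20)
    print(gibbs_two_node(example))
```

With complete data the Beta posteriors would be available in closed form; the imputation step is what lets the same conjugate updates be reused when some observations are missing.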







Copyright information

© Springer-Verlag New York 1994

Authors and Affiliations

  • Jeremy C. York (1)
  • David Madigan (2)
  1. Dept of Statistics, Carnegie Mellon University, USA
  2. Dept of Statistics, University of Washington, USA
