
Comparison Between Suitable Priors for Additive Bayesian Networks

  • Gilles Kratzer (corresponding author)
  • Reinhard Furrer
  • Marta Pittavino
Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 296)

Abstract

Additive Bayesian networks (ABN) are a class of graphical models that extend the usual Bayesian generalised linear model to multiple dependent variables through a factorisation of the joint probability distribution of the underlying variables. When fitting an ABN model, the choice of prior for the parameters is of crucial importance. An inadequate prior, for example one that is not sufficiently informative, can let data separation and data sparsity derail the model selection process. In this work we present a simulation study comparing two weakly informative priors with a strongly informative one. The first weakly informative prior is a zero-mean Gaussian with a large variance, as currently implemented in the R package abn. The second is a Student's t prior specifically designed for logistic regression. The strongly informative prior is a Gaussian centred at the true parameter value with a small variance. We compare the impact of these priors on the accuracy of the learned additive Bayesian network as a function of different simulation parameters, and we use the same study to illustrate how the prior choice can trigger Lindley's paradox. We conclude by highlighting the good performance of the Student's t prior and the limited impact of Lindley's paradox, and we close with suggestions for further developments.
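The comparison hinges on how each prior behaves under data separation, the situation in which a covariate perfectly predicts a binary outcome and the maximum likelihood estimate diverges. The following R snippet is not the paper's own pipeline (the ABN scores are computed internally by the abn package); it is a minimal sketch that only contrasts the two weakly informative prior choices on a single logistic regression with complete separation, using arm::bayesglm(), which implements the Student's t prior of Gelman et al. The sample size, seed, and the scale values 100 and 2.5 are illustrative assumptions.

    ## Illustrative sketch: complete separation in a logistic regression, fitted by
    ## maximum likelihood, with a vague zero-mean Gaussian prior, and with the
    ## weakly informative Student's t (Cauchy) prior of Gelman et al. (2008).
    library(arm)  # provides bayesglm()

    set.seed(1)
    n <- 40
    x <- rnorm(n)
    y <- as.integer(x > 0)  # the outcome is perfectly separated by x

    ## Maximum likelihood: the slope estimate drifts off and its standard error explodes
    fit_ml <- glm(y ~ x, family = binomial)

    ## Vague zero-mean Gaussian prior (prior.df = Inf gives a normal prior in bayesglm)
    fit_vague <- bayesglm(y ~ x, family = binomial,
                          prior.mean = 0, prior.scale = 100, prior.df = Inf)

    ## Student's t prior with 1 degree of freedom (Cauchy) and scale 2.5
    fit_t <- bayesglm(y ~ x, family = binomial,
                      prior.mean = 0, prior.scale = 2.5, prior.df = 1)

    cbind(ML = coef(fit_ml), vague.gaussian = coef(fit_vague), cauchy = coef(fit_t))

Under separation the maximum likelihood slope is effectively unbounded, the vague Gaussian prior restrains it only mildly, and the more concentrated but heavier-tailed Student's t prior pulls the estimate back to a moderate value. In the ABN setting, analogous shrinkage affects the marginal likelihoods on which the structure search relies.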

Keywords

Graph theory · Structural search · Binomial regression

References

  1. Chen, M., Ibrahim, J.G.: Conjugate priors for generalized linear models. Statistica Sinica 13, 461–476 (2003)
  2. Diaconis, P., Ylvisaker, D.: Conjugate priors for exponential families. Ann. Stat. 7(2), 269–281 (1979)
  3. Djebbari, A., Quackenbush, J.: Seeded Bayesian networks: constructing genetic networks from microarray data. BMC Syst. Biol. 2(1), 57 (2008)
  4. Dojer, N., Gambin, A., Mizera, A., Wilczyński, B., Tiuryn, J.: Applying dynamic Bayesian networks to perturbed gene expression data. BMC Bioinform. 7(1), 249 (2006)
  5. Firth, D.: Bias reduction of maximum likelihood estimates. Biometrika 80(1), 27–38 (1993)
  6. Flesch, I., Lucas, P.J.: Markov equivalence in Bayesian networks. In: Lucas, P., Gámez, J.A., Salmerón, A. (eds.) Advances in Probabilistic Graphical Models, pp. 3–38. Springer, Berlin, Heidelberg (2007)
  7. Gelman, A., Stern, H.S., Carlin, J.B., Dunson, D.B., Vehtari, A., Rubin, D.B.: Bayesian Data Analysis. Chapman and Hall/CRC (2013)
  8. Gelman, A., Jakulin, A., Pittau, M.G., Su, Y.S.: A weakly informative default prior distribution for logistic and other regression models. Ann. Appl. Stat. 2(4), 1360–1383 (2008)
  9. Gutiérrez-Peña, E., Smith, A.F.M.: Conjugate parameterizations for natural exponential families. J. Am. Stat. Assoc. 90(432), 1347–1356 (1995)
  10. Hartnack, S., Springer, S., Pittavino, M., Grimm, H.: Attitudes of Austrian veterinarians towards euthanasia in small animal practice: impacts of age and gender on views on euthanasia. BMC Vet. Res. 12(1), 26 (2016)
  11. Heckerman, D., Geiger, D., Chickering, D.M.: Learning Bayesian networks: the combination of knowledge and statistical data. Mach. Learn. 20(3), 197–243 (1995)
  12. Hodges, A.P., Dai, D., Xiang, Z., Woolf, P., Xi, C., He, Y.: Bayesian network expansion identifies new ROS and biofilm regulators. PLOS One 5(3), e9513 (2010)
  13. Jansen, R., Yu, H., Greenbaum, D., Kluger, Y., Krogan, N.J., Chung, S., Emili, A., Snyder, M., Greenblatt, J.F., Gerstein, M.: A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 302(5644), 449–453 (2003)
  14. Koivisto, M., Sood, K.: Exact Bayesian structure discovery in Bayesian networks. J. Mach. Learn. Res. 5, 549–573 (2004)
  15. Kratzer, G., Pittavino, M., Lewis, F.I.: abn: an R package for modelling multivariate data using additive Bayesian networks. R package version 1.3 (2018). https://CRAN.R-project.org/package=abn
  16. Kratzer, G., Furrer, R.: Information-theoretic scoring rules to learn additive Bayesian network applied to epidemiology. arXiv:1808.01126 (2018)
  17. Lewis, F.I.: Bayesian networks as a tool for epidemiological systems analysis. In: AIP Conference Proceedings, vol. 1493, pp. 610–617 (2012)
  18. Lewis, F.I., Brülisauer, F., Gunn, G.J.: Structure discovery in Bayesian networks: an analytical tool for analysing complex animal health data. Prev. Vet. Med. 100(2), 109–115 (2011)
  19. Lewis, F.I., McCormick, B.J.: Revealing the complexity of health determinants in resource-poor settings. Am. J. Epidemiol. 176(11), 1051–1059 (2012)
  20. Lewis, F.I., Ward, M.P.: Improving epidemiologic data analyses through multivariate regression modelling. Emerg. Themes Epidemiol. 10(1), 4 (2013)
  21. Lindley, D.V.: A statistical paradox. Biometrika 44(1/2), 187–192 (1957)
  22. Pitman, E.J.G.: Sufficient statistics and intrinsic accuracy. Math. Proc. Camb. Philos. Soc. 32(4), 567–579 (1936)
  23. Pittavino, M.: Additive Bayesian networks for multivariate data: parameter learning, model fitting and applications in veterinary epidemiology. Ph.D. thesis, University of Zurich (2016)
  24. Pittavino, M., Dreyfus, A., Heuer, C., Benschop, J., Wilson, P., Collins-Emerson, J., Torgerson, P.R., Furrer, R.: Comparison between generalized linear modelling and additive Bayesian network; identification of factors associated with the incidence of antibodies against Leptospira interrogans sv Pomona in meat workers in New Zealand. Acta Trop. 173, 191–199 (2017)
  25. Poon, A.F.Y., Lewis, F.I., Pond, S.L.K., Frost, S.D.W.: Evolutionary interactions between N-linked glycosylation sites in the HIV-1 envelope. PLOS Comput. Biol. 3(1), e11 (2007)
  26. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2017)
  27. Robinson, R.W.: Counting unlabeled acyclic digraphs. In: Little, C.H.C. (ed.) Combinatorial Mathematics V, pp. 28–43. Springer, Berlin, Heidelberg (1977)
  28. Sanchez-Vazquez, M.J., Nielen, M., Edwards, S.A., Gunn, G.J., Lewis, F.I.: Identifying associations between pig pathologies using a multi-dimensional machine learning methodology. BMC Vet. Res. 8(1), 151 (2012)
  29. Ward, M.P., Lewis, F.I.: Bayesian graphical modelling: applications in veterinary epidemiology. Prev. Vet. Med. 110(1), 1–3 (2013)
  30. Zorn, C.: A solution to separation in binary response models. Polit. Anal. 13(2), 157–170 (2005)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Gilles Kratzer (1), corresponding author
  • Reinhard Furrer (2)
  • Marta Pittavino (3)

  1. Department of Mathematics, University of Zurich, Zurich, Switzerland
  2. Department of Mathematics and Department of Computational Science, University of Zurich, Zurich, Switzerland
  3. Geneva School of Economics and Management, Research Center for Statistics, University of Geneva, Geneva, Switzerland
