Learning Bayesian Network Structure from Incomplete Data without Any Assumption

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4947)

Abstract

Since most real-life data contain missing values, reasoning and learning with incomplete data has become crucial in data mining and machine learning. In particular, Bayesian networks are one machine learning technique that allows for reasoning with incomplete data, but training such networks on incomplete data may be a difficult task. Many methods have thus been proposed to learn Bayesian network structure from incomplete data, based on generating multiple candidate structures and scoring their adequacy to the dataset. However, this kind of approach may be time-consuming. We therefore propose an efficient dependency analysis approach that uses a redefinition of probability calculation to take incomplete records into account while learning Bayesian network (BN) structure, without generating multiple possibilities. Experiments on well-known benchmarks are described to show the validity of our proposal.
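
The abstract does not spell out how the probability calculation is redefined, so the following is only a rough illustration of the general idea and not the authors' method. It assumes one simple reading: when measuring the dependency between two variables, use every record in which both variables are observed, rather than discarding every record that has a missing value anywhere. The function name, the `None` missing-value marker, and the toy records are all hypothetical.

```python
from collections import Counter
from math import log2

MISSING = None  # hypothetical marker for a missing value


def pairwise_mutual_information(records, x, y):
    """Estimate I(X;Y) in bits from records where both x and y are observed.

    Returns the estimate and the number of records actually used.
    This is an available-case estimate, assumed here for illustration only.
    """
    usable = [
        (r[x], r[y])
        for r in records
        if r[x] is not MISSING and r[y] is not MISSING
    ]
    n = len(usable)
    if n == 0:
        return 0.0, 0

    joint = Counter(usable)              # counts of (x, y) value pairs
    px = Counter(a for a, _ in usable)   # marginal counts of x
    py = Counter(b for _, b in usable)   # marginal counts of y

    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[a] * py[b]))
    return mi, n


# Toy dataset: only two of the five records are fully observed.
records = [
    {"A": 0, "B": 0, "C": MISSING},
    {"A": 1, "B": 1, "C": 0},
    {"A": 0, "B": 0, "C": 1},
    {"A": 1, "B": 1, "C": MISSING},
    {"A": MISSING, "B": 0, "C": 1},
]

mi_ab, used_ab = pairwise_mutual_information(records, "A", "B")
complete_cases = sum(all(v is not MISSING for v in r.values()) for r in records)
print(f"I(A;B) = {mi_ab:.3f} bits from {used_ab} records "
      f"(complete-case analysis would keep {complete_cases})")
```

In this toy dataset the A/B dependency is estimated from four records, whereas restricting to fully observed records would leave two. A dependency analysis learner could feed such pairwise estimates into its independence tests; whether the paper does exactly this cannot be determined from the abstract.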

Editor information

Jayant R. Haritsa, Ramamohanarao Kotagiri, Vikram Pudi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fiot, C., Saptawati, G.A.P., Laurent, A., Teisseire, M. (2008). Learning Bayesian Network Structure from Incomplete Data without Any Assumption. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_30

  • DOI: https://doi.org/10.1007/978-3-540-78568-2_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78567-5

  • Online ISBN: 978-3-540-78568-2

  • eBook Packages: Computer Science, Computer Science (R0)
