Learning Bayesian Network Structure from Incomplete Data without Any Assumption

Fiot, Céline; Saptawati, G. A. Putri; Laurent, Anne; Teisseire, Maguelonne

doi:10.1007/978-3-540-78568-2_30

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4947)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Since most real-life data contain missing values, reasoning and learning with incomplete data have become crucial in data mining and machine learning. Bayesian networks in particular are a machine learning technique that supports reasoning with incomplete data, but training such networks on incomplete data can be difficult. Many methods have therefore been proposed to learn Bayesian network structure from incomplete data; they generate multiple candidate structures and score how well each fits the dataset. However, approaches of this kind can be time-consuming. We therefore propose an efficient dependency analysis approach that redefines how probabilities are calculated so that incomplete records are taken into account while learning the Bayesian network (BN) structure, without generating multiple possibilities. Experiments on well-known benchmarks are described to show the validity of our proposal.
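
The abstract does not spell out how the probability calculation is redefined, so the following is only an illustrative sketch of the general idea of dependency analysis on incomplete records, not the authors' method. It assumes an available-case estimate: the mutual information between two variables is computed from every record in which both are observed, even if other fields of that record are missing. The function name `available_case_mi`, the `MISSING` marker, and the toy records are all hypothetical.

```python
from collections import Counter
from math import log2

MISSING = None  # assumed marker for a missing value (hypothetical convention)


def available_case_mi(records, x, y):
    """Estimate the mutual information I(X;Y) in bits.

    Only records in which both x and y are observed contribute;
    missing values in other fields do not disqualify a record.
    """
    pairs = [(r[x], r[y]) for r in records
             if r[x] is not MISSING and r[y] is not MISSING]
    n = len(pairs)
    if n == 0:
        return 0.0  # no usable evidence for this pair of variables
    joint = Counter(pairs)
    px = Counter(a for a, _ in pairs)
    py = Counter(b for _, b in pairs)
    # sum over observed (a, b) of p(a,b) * log2(p(a,b) / (p(a) * p(b)))
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


# Toy dataset: three of the six records have one missing field.
records = [
    {"A": 0, "B": 0, "C": 1},
    {"A": 1, "B": 1, "C": MISSING},
    {"A": 0, "B": 0, "C": 0},
    {"A": 1, "B": 1, "C": 1},
    {"A": MISSING, "B": 0, "C": 0},
    {"A": 1, "B": MISSING, "C": 1},
]

mi_ab = available_case_mi(records, "A", "B")
```

In this toy dataset, four records contribute to the estimate for `A` and `B`, whereas discarding every incomplete record would leave only three. A dependency analysis procedure would compare such estimates against a threshold to decide whether to connect two variables; the choice of threshold and of conditional tests is left out here because the abstract does not describe them.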

Author information

Authors and Affiliations

LIRMM, Univ. Montpellier II, CNRS, 161 rue Ada, 34392, Montpellier, France
Céline Fiot, Anne Laurent & Maguelonne Teisseire
Institut Teknologi Bandung, Jl. Ganesha 10, Bandung, 40132, Indonesia
G. A. Putri Saptawati

Editor information

Jayant R. Haritsa, Ramamohanarao Kotagiri, Vikram Pudi

Cite this paper

Fiot, C., Saptawati, G.A.P., Laurent, A., Teisseire, M. (2008). Learning Bayesian Network Structure from Incomplete Data without Any Assumption. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_30

DOI: https://doi.org/10.1007/978-3-540-78568-2_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78567-5
Online ISBN: 978-3-540-78568-2
eBook Packages: Computer Science, Computer Science (R0)
