An Experimental Comparison of Methods for Handling Incomplete Data in Learning Parameters of Bayesian Networks

  • Agnieszka Oniśko
  • Marek J. Druzdzel
  • Hanna Wasyluk
Part of the Advances in Soft Computing book series (AINSC, volume 17)


Missing attribute values in data sets, also referred to as incomplete data, pose difficulties in learning tasks such as classification, data mining, or learning the structure and numerical parameters of a Bayesian network. Because incomplete data are so common in practice, many methods have been proposed to deal with them, yet few studies compare their performance. The Hepar II project presents an excellent opportunity to test experimentally how these methods perform on a real data set. We briefly review several popular methods for handling incomplete data and then compare them on the task of learning the conditional probability distributions of a Bayesian network model, using the resulting diagnostic accuracy as the comparison criterion. While substitution of “normal” values for missing attributes appeared to perform best, we observed only small differences in performance among the studied methods.
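To make the comparison concrete, the sketch below contrasts two simple strategies the abstract alludes to: available-case analysis (ignoring records where the attribute is missing) versus substitution of a designated “normal” value before estimating a probability distribution. This is a minimal illustration under assumed data structures (dictionaries with `None` for missing values); the attribute name `bilirubin` and the function are hypothetical and not the authors' actual code.

```python
from collections import Counter

def estimate_distribution(records, attr, strategy="ignore", normal_value=None):
    """Estimate a marginal probability distribution for one attribute
    from records that may contain missing values (None).

    strategy="ignore": available-case analysis -- drop missing entries.
    strategy="normal": substitute a designated "normal" value
    (a sketch of the substitution idea, not the authors' pipeline).
    """
    values = []
    for r in records:
        v = r.get(attr)
        if v is None:
            if strategy == "normal":
                values.append(normal_value)
            # strategy "ignore": skip this record for this attribute
        else:
            values.append(v)
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

# Hypothetical toy records: 'bilirubin' is missing in two cases.
records = [
    {"bilirubin": "elevated"},
    {"bilirubin": None},
    {"bilirubin": "normal"},
    {"bilirubin": None},
]
p_ignore = estimate_distribution(records, "bilirubin", "ignore")
p_normal = estimate_distribution(records, "bilirubin", "normal",
                                 normal_value="normal")
```

With these toy records the two strategies disagree: ignoring missing values yields equal probabilities for the two observed states, while “normal” substitution shifts mass toward the normal state, which is exactly the kind of difference whose diagnostic impact the paper measures.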


Keywords: Bayesian Network · Incomplete Data · Conditional Probability Distribution · Bayesian Network Model · Bayesian Network Structure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Agnieszka Oniśko (1)
  • Marek J. Druzdzel (2)
  • Hanna Wasyluk (3)
  1. Faculty of Computer Science, Białystok University of Technology, Białystok, Poland
  2. Decision Systems Laboratory, School of Information Sciences, Intelligent Systems Program, and Center for Biomedical Informatics, University of Pittsburgh, Pittsburgh, USA
  3. The Medical Center of Postgraduate Education, and Institute of Biocybernetics and Biomedical Engineering, Polish Academy of Sciences, Warsaw, Poland
