Markov Chain Monte-Carlo Methods for Missing Data Under Ignorability Assumptions

  • Haresh Rochani
  • Daniel F. Linder
Part of the ICSA Book Series in Statistics book series (ICSABSS)


Missing observations are common in public health, clinical studies and social science research. Discarding cases with missing observations, an approach known as complete case analysis, reduces statistical power and can bias estimates. Fully Bayesian methods using Markov Chain Monte-Carlo (MCMC) provide a model-based alternative to complete case analysis by treating missing values as unknown parameters. The Bayesian paradigm handles this naturally: the MCMC routine is augmented with an additional layer that, in the case of Gibbs sampling, draws the missing data from their full conditional distributions. Here we detail the ideas behind the Bayesian treatment of missing data and conduct simulations to illustrate the methodology. Specifically, we consider Bayesian multivariate regression with missing responses and the missing covariate setting under an ignorability assumption. Applications to real datasets are provided.
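The augmentation idea can be sketched with a toy Gibbs sampler. This is a minimal illustration and not the chapter's own model: it assumes a simple linear regression y = b0 + b1*x + e with a normally distributed covariate x, flat priors on (b0, b1) and on the covariate mean mu, known variances sigma2 and tau2, and covariate values missing completely at random (one ignorable mechanism). The function name, its parameters and the simulated data are all hypothetical.

```python
import numpy as np

def gibbs_missing_covariate(y, x, n_iter=2000, burn_in=500,
                            sigma2=1.0, tau2=1.0, seed=0):
    """Toy Gibbs sampler with data augmentation for a missing covariate.

    Assumed model (illustrative only):
        x_i ~ N(mu, tau2),  y_i | x_i ~ N(b0 + b1 * x_i, sigma2),
    with flat priors on (b0, b1) and mu, and known sigma2 and tau2.
    Missing entries of x are marked with NaN.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    x = np.array(x, dtype=float)      # copy so the caller's array is untouched
    miss = np.isnan(x)
    n = len(y)
    x[miss] = np.nanmean(x)           # starting values for the missing x_i
    draws = np.empty((n_iter - burn_in, 3))

    for t in range(n_iter):
        # 1. Regression coefficients given the completed data.
        X = np.column_stack([np.ones(n), x])
        XtX_inv = np.linalg.inv(X.T @ X)
        beta = rng.multivariate_normal(XtX_inv @ X.T @ y, sigma2 * XtX_inv)

        # 2. Covariate mean given the completed covariate vector.
        mu = rng.normal(x.mean(), np.sqrt(tau2 / n))

        # 3. Augmentation step: draw each missing x_i from its full
        #    conditional, which combines the covariate model and the
        #    regression likelihood (both normal, so the result is normal).
        prec = 1.0 / tau2 + beta[1] ** 2 / sigma2
        cond_mean = (mu / tau2 + beta[1] * (y[miss] - beta[0]) / sigma2) / prec
        x[miss] = rng.normal(cond_mean, np.sqrt(1.0 / prec))

        if t >= burn_in:
            draws[t - burn_in] = (beta[0], beta[1], mu)
    return draws

# Simulated example: true b0 = 1, b1 = 2, mu = 2; about 30% of x is deleted
# completely at random.
sim = np.random.default_rng(42)
x_full = sim.normal(2.0, 1.0, size=200)
y_obs = 1.0 + 2.0 * x_full + sim.normal(0.0, 1.0, size=200)
x_obs = x_full.copy()
x_obs[sim.random(200) < 0.3] = np.nan

draws = gibbs_missing_covariate(y_obs, x_obs)
post_mean = draws.mean(axis=0)
print("posterior means (b0, b1, mu):", post_mean)
```

Step 3 is the "additional layer": the missing covariate values are updated like any other unknown, so every case contributes to the regression update in step 1 rather than being discarded. In a realistic analysis the variances would also be given priors and sampled.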





Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Department of Biostatistics, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, Georgia
  2. Department of Biostatistics and Epidemiology, Medical College of Georgia, Augusta University, Augusta, Georgia
