Abstract
In studies of whether hospital or health-center interventions can improve screening rates for mammography and Pap smears in Los Angeles County, the availability of data from multiple sources makes it possible to combine information in an effort to improve the estimation of intervention effects. Primary sources of information, namely computerized databases that record screening outcomes and some covariates on a routine basis, are supplemented by medical chart reviews that provide additional, sometimes conflicting, assessments of screening outcomes along with additional covariates. Available data can be classified in a large contingency table where, because medical charts were not reviewed for all individuals, some cases can only be classified into a certain margin as opposed to a specific cell. This paper outlines a multiple imputation approach to facilitate data analysis using the framework of Schafer (1991, 1995), which involves drawing imputations from a multinomial distribution with cell probabilities estimated from a loglinear model fitted to the incomplete contingency table. Because of the sparseness of the contingency table, a cavalier choice of a convenient prior distribution can be problematic. The completed data are then analyzed using the method of propensity score subclassification (Rosenbaum and Rubin 1984) to reflect differences in the patient populations at different hospitals or health centers.
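The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes a two-way table, a saturated multinomial model with a symmetric Dirichlet prior in place of the fitted loglinear model, and cases that are missing only their column classification. The function name `draw_imputations`, its parameters, and the toy counts are all hypothetical.

```python
import numpy as np

def draw_imputations(full_counts, row_only_counts, m=5, prior_alpha=1.0,
                     burn_in=100, thin=20, seed=0):
    """Return m completed tables from a simple data-augmentation chain.

    full_counts:     (R, C) counts of fully classified cases.
    row_only_counts: (R,) counts of cases whose row is known but whose
                     column is not (classified into a margin only).
    prior_alpha:     assumed symmetric Dirichlet prior weight per cell.
    """
    rng = np.random.default_rng(seed)
    full = np.asarray(full_counts, dtype=float)
    row_only = np.asarray(row_only_counts, dtype=int)

    # Start the chain at the prior-smoothed observed cell proportions.
    theta = (full + prior_alpha) / (full + prior_alpha).sum()

    imputations = []
    for it in range(1, burn_in + m * thin + 1):
        # Imputation step: allocate each row's partially classified cases
        # to columns with a multinomial draw from the conditional
        # cell probabilities given the row.
        cond = theta / theta.sum(axis=1, keepdims=True)
        alloc = np.vstack([rng.multinomial(n, p)
                           for n, p in zip(row_only, cond)])
        completed = full + alloc

        # Posterior step: redraw cell probabilities given the completed table.
        theta = rng.dirichlet((completed + prior_alpha).ravel()).reshape(full.shape)

        # Keep every `thin`-th table after burn-in as one imputation.
        if it > burn_in and (it - burn_in) % thin == 0:
            imputations.append(completed.copy())
    return imputations

# Toy example: 2 x 2 table with 12 and 6 cases known only by row.
tables = draw_imputations([[30, 5], [4, 1]], [12, 6])
```

Each returned table preserves the observed row totals while distributing the margin-only cases across cells. In a sparse table the allocation can be sensitive to `prior_alpha`, which is the concern the abstract raises about choosing a prior for convenience.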
References
Belin, T.R., Diffendal, G.J., Mack, S., Rubin, D.B., Schafer, J.L., Zaslavsky, A.M. (1993). “Hierarchical Logistic-Regression Models for Imputation of Unresolved Enumeration Status in Undercount Estimation” (with discussion), Journal of the American Statistical Association, 88, 1149–1166.
Carroll, R.J. (1989). “Covariance Analysis in Generalized Measurement Error Models”, Statistics in Medicine, 8, 1075–1093.
Clogg, C.C., Rubin, D.B., Schenker, N., Schultz, B., Weidman, L. (1991). “Multiple Imputation of Industry and Occupation Codes in Census Public-Use Samples using Bayesian Logistic Regression”, Journal of the American Statistical Association, 86, 68–78.
Cochran, W.G. (1968). “The Effectiveness of Adjustment By Subclassification in Removing Bias in Observational Studies”, Biometrics, 24, 205–213.
Cochran, W.G. (1977). Sampling Techniques, 3rd ed., New York: John Wiley.
Cochran, W.G. (1983). Planning and Analysis of Observational Studies, New York: John Wiley.
Cook, T.D., and Campbell, D.T. (1979). Quasi-Experimentation: Design and Analysis Issues for Field Settings, Boston: Houghton Mifflin.
David, M., Little, R.J.A., Samuhel, M.E., and Triest, R.K. (1986). “Alternative Methods for CPS Income Imputation”, Journal of the American Statistical Association, 81, 29–41.
Fay, R.E. (1991). “A Design-Based Perspective on Missing Data Variance”, Proceedings of the 1991 Annual Research Conference, Bureau of the Census, Washington, D.C.
Fuchs, C. (1982). “Maximum Likelihood Estimation and Model Selection in Contingency Tables with Missing Data”, Journal of the American Statistical Association, 77, 270–278.
Gelman, A., Meng, X.L., and Rubin, D.B. (1992). “Simulating the Posterior Distribution of Loglinear Contingency Table Models”, technical report, Dept. of Statistics, University of California, Berkeley.
Gelman, A., and Rubin, D.B. (1992). “Inference from Iterative Simulation Using Multiple Sequences” (with discussion), Statistical Science, 7, 457–511.
Glynn, R.J., Laird, N.M., and Rubin, D.B. (1986). “Selection Modeling Versus Mixture Modeling with Nonignorable Nonresponse”, in Drawing Inferences from Self-Selected Samples, H. Wainer, ed., New York: Springer-Verlag.
Heitjan, D.F., and Rubin, D.B. (1990). “Inference from Coarse Data via Multiple Imputation with Application to Age Heaping”, Journal of the American Statistical Association, 85, 304–314.
Meng, X.L., and Rubin, D.B. (1991). “IPF for Contingency Tables with Missing Data via the ECM Algorithm”, Proceedings of the ASA Section on Statistical Computing, 244–247.
Meng, X.L., and Rubin, D.B. (1993). “Maximum Likelihood via the ECM Algorithm: A General Framework”, Biometrika, 80, 267–278.
Mosteller, F., and Tukey, J.W. (1977). Data Analysis and Regression, Reading, MA: Addison-Wesley.
Pollock, K.H., Nichols, J.D., Brownie, C., Hines, J.E. (1990). Statistical Inference for Capture-Recapture Experiments, Bethesda, MD: Wildlife Society.
Rosenbaum, P.R., and Rubin, D.B. (1983). “The Central Role of the Propensity Score in Observational Studies for Causal Effects”, Biometrika, 70, 41–55.
Rosenbaum, P.R., and Rubin, D.B. (1984). “Reducing Bias in Observational Studies using Subclassification on the Propensity Score”, Journal of the American Statistical Association, 79, 516–524.
Rubin, D.B. (1976). “Inference and Missing Data”, Biometrika, 63, 581–592.
Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys, New York: John Wiley.
Rubin, D.B., and Schenker, N. (1991). “Multiple Imputation in Health-Care Databases: An Overview and Some Applications”, Statistics in Medicine, 10, 585–598.
Schafer, J.L. (1991). “Algorithms for Multivariate Imputation with Ignorable Nonresponse”, Ph.D. thesis, Dept. of Statistics, Harvard University.
Schafer, J.L. (1995). Analysis and Simulation of Incomplete Multivariate Data, New York: Chapman and Hall, to appear.
Tanner, M.A., and Wong, W.H. (1987). “The Calculation of Posterior Distributions by Data Augmentation” (with discussion), Journal of the American Statistical Association, 82, 528–550.
Copyright information
© 1995 Springer-Verlag New York, Inc.
About this paper
Cite this paper
Belin, T.R. et al. (1995). Combining Information from Multiple Sources in the Analysis of a Non-Equivalent Control Group Design. In: Gatsonis, C., Hodges, J.S., Kass, R.E., Singpurwalla, N.D. (eds) Case Studies in Bayesian Statistics, Volume II. Lecture Notes in Statistics, vol 105. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2546-1_5
DOI: https://doi.org/10.1007/978-1-4612-2546-1_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-94566-8
Online ISBN: 978-1-4612-2546-1
eBook Packages: Springer Book Archive