Missing Data in Meta-analysis: Strategies and Approaches

Pigott, Terri D.

doi:10.1007/978-1-4614-2278-5_7

Part of the book series: Statistics for Social and Behavioral Sciences (SSBS)

Abstract

This chapter provides an overview of missing data issues that can occur in a meta-analysis. Common approaches to missing data in meta-analysis are discussed. The chapter focuses on the problem of missing data in moderators of effect size. The examples demonstrate the use of maximum likelihood methods and multiple imputation, the only two methods that produce unbiased estimates under the assumption that data are missing at random. The methods discussed in this chapter are most useful in testing the sensitivity of results to missing data.

References

Allison, P.D. 2002. Missing data. Thousand Oaks: Sage.
Begg, C.B., and J.A. Berlin. 1988. Publication bias: A problem in interpreting medical data (with discussion). Journal of the Royal Statistical Society Series A 151(2): 419–463.
Buck, S.F. 1960. A method of estimation of missing values in multivariate data suitable for use with an electronic computer. Journal of the Royal Statistical Society Series B 22(2): 302–303.
Chan, A.-W., A. Hrobjartsson, M.T. Haahr, P.C. Gotzsche, and D.G. Altman. 2004. Empirical evidence for selective reporting of outcomes in randomized trials. Journal of the American Medical Association 291(20): 2457–2465.
Dempster, A.P., N.M. Laird, and D.B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society Series B 39(1): 1–38.
Duval, S. 2005. The Trim and Fill method. In Publication bias in meta-analysis: Prevention, assessment and adjustments, ed. H.R. Rothstein, A.J. Sutton, and M. Borenstein. West Sussex: Wiley.
Duval, S., and R. Tweedie. 2000. Trim and fill: A simple funnel plot based method of testing and adjusting for publication bias in meta-analysis. Biometrics 56(2): 455–463.
Eagly, A.H., M.C. Johannesen-Schmidt, and M.L. van Engen. 2003. Transformational, transactional, and laissez-faire leadership styles: A meta-analysis comparing women and men. Psychological Bulletin 129(4): 569–592.
Egger, M., G.D. Smith, M. Schneider, and C. Minder. 1997. Bias in meta-analysis detected by a simple, graphical test. British Medical Journal 315(7109): 629–634.
Enders, C.K. 2010. Applied missing data analysis. Methodology in the Social Sciences. New York: Guilford.
Fahrbach, K.R. 2001. An investigation of methods for mixed-model meta-analysis in the presence of missing data. Lansing: Michigan State University.
Glasser, M. 1964. Linear regression analysis with missing observations among the independent variables. Journal of the American Statistical Association 59(307): 834–844.
Hackshaw, A.K., M.R. Law, and N.J. Wald. 1997. The accumulated evidence on lung cancer and environmental tobacco smoke. British Medical Journal 315(7114): 980–988.
Haitovsky, Y. 1968. Missing data in regression analysis. Journal of the Royal Statistical Society Series B 30(1): 67–82.
Hemminki, E. 1980. Study of information submitted by drug companies to licensing authorities. British Medical Journal 280(6217): 833–836.
Honaker, J., G. King, and M. Blackwell. 2011. Amelia II: A program for missing data. http://r.iq.harvard.edu/src/contrib/
Kim, J.-O., and J. Curry. 1977. The treatment of missing data in multivariate analysis. Sociological Methods and Research 6(2): 215–240.
Lipsey, M.W., and D.B. Wilson. 2001. Practical meta-analysis. Thousand Oaks: Sage Publications.
Little, R.J.A., and D.B. Rubin. 1987. Statistical analysis with missing data. New York: Wiley.
Orwin, R.G., and D.S. Cordray. 1985. Effects of deficient reporting on meta-analysis: A conceptual framework and reanalysis. Psychological Bulletin 97(1): 134–147.
Rosenthal, R. 1979. The file drawer problem and tolerance for null results. Psychological Bulletin 86(3): 638–641.
Rothstein, H.R., A.J. Sutton, and M. Borenstein. 2005. Publication bias in meta-analysis: Prevention, assessment and adjustments. West Sussex: Wiley.
Rubin, D.B. 1976. Inference and missing data. Biometrika 63(3): 581–592.
Rubin, D.B. 1987. Multiple imputation for nonresponse in surveys. New York: Wiley.
Schafer, J.L. 1997. Analysis of incomplete multivariate data. London: Chapman & Hall.
Schafer, J.L. 1999. NORM: Multiple imputation of incomplete multivariate data under a normal model. Software for Windows. University Park: Department of Statistics, Penn State University.
Schafer, J.L., and J.W. Graham. 2002. Missing data: Our view of the state of the art. Psychological Methods 7(2): 147–177.
Shadish, W.R., L. Robinson, and C. Lu. 1999. ES: A computer program and manual for effect size calculation. St. Paul: Assessment Systems Corporation.
Sirin, S.R. 2005. Socioeconomic status and academic achievement: A meta-analytic review of research. Review of Educational Research 75(3): 417–453. doi:10.3102/00346543075003417.
Smith, M.L. 1980. Publication bias and meta-analysis. Evaluation in Education 4: 22–24.
Sterne, J.A.C., B.J. Becker, and M. Egger. 2005. The funnel plot. In Publication bias in meta-analysis: Prevention, assessment and adjustments, ed. H.R. Rothstein, A.J. Sutton, and M. Borenstein. West Sussex: Wiley.
Vevea, J.L., and C.M. Woods. 2005. Publication bias in research synthesis: Sensitivity analysis using a priori weight functions. Psychological Methods 10(4): 428–443.
Williamson, P.R., C. Gamble, D.G. Altman, and J.L. Hutton. 2005. Outcome selection bias in meta-analysis. Statistical Methods in Medical Research 14(5): 515–524.
Wilson, D.B. 2010. Practical meta-analysis effect size calculator. Campbell Collaboration. http://www.campbellcollaboration.org/resources/effect_size_input.php. Accessed 16 July 2011.
Yuan, Y.C. 2000. Multiple imputation for missing data: Concepts and new developments. http://support.sas.com/rnd/app/papers/multipleimputation.pdf. Accessed 2 April 2011.

Author information

Authors and Affiliations

School of Education, Loyola University Chicago, Chicago, IL, USA
Terri D. Pigott

Appendix

7.1.1 Computing Packages for Computation of the Multiple Imputation Results

There are a number of options for obtaining multiple imputation results in a meta-analysis model. Two freeware programs are available. The first is Norm, by Schafer (1999), available at http://www.stat.psu.edu/~jls/misoftwa.html. Norm runs as a stand-alone program on Windows 95/98/NT. The second is Amelia II, an R package by Honaker et al. (2011), available at http://gking.harvard.edu/amelia/. Schafer’s Norm program was used for the example given earlier.

SAS includes two procedures for multiple imputation: PROC MI generates the imputations, and PROC MIANALYZE analyzes the completed data sets. For the weighted regression results needed in a meta-analysis, PROC MIANALYZE has limited utility, since the standard errors of the weighted regression coefficients must first be adjusted as detailed by Lipsey and Wilson (2001). Section 7.1.1.2 below illustrates the use of PROC MI for the leadership data.
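The standard error adjustment mentioned above can be illustrated with a small sketch. It is written in Python purely for illustration (the chapter itself uses R, SAS, and Excel) and assumes a fixed-effects model with a single moderator and inverse-variance weights. It also assumes the adjustment takes the form usually given for weighted regression in general-purpose software, namely dividing the reported standard error by the square root of the weighted mean square error. The function name and the data are hypothetical.

```python
import math

def weighted_regression(effect_sizes, variances, x):
    """Fixed-effects meta-regression of effect size on one moderator x.

    Hypothetical helper for illustration: weights are 1 / variance.
    """
    k = len(effect_sizes)
    if k < 3:
        raise ValueError("need at least three studies for intercept + slope")

    w = [1.0 / v for v in variances]  # inverse-variance weights
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swt = sum(wi * ti for wi, ti in zip(w, effect_sizes))
    swxt = sum(wi * xi * ti for wi, xi, ti in zip(w, x, effect_sizes))
    det = sw * swxx - swx ** 2  # determinant of X'WX

    # Weighted least squares solution (closed form for a 2 x 2 system)
    b0 = (swxx * swt - swx * swxt) / det
    b1 = (sw * swxt - swx * swt) / det

    # Standard errors appropriate for meta-analysis: sqrt(diag((X'WX)^-1))
    se_adjusted = [math.sqrt(swxx / det), math.sqrt(sw / det)]

    # Generic regression software multiplies the variances by the weighted
    # mean square error, so its reported SEs equal se_adjusted * sqrt(mse).
    resid_ss = sum(
        wi * (ti - b0 - b1 * xi) ** 2
        for wi, ti, xi in zip(w, effect_sizes, x)
    )
    mse = resid_ss / (k - 2)
    se_software = [s * math.sqrt(mse) for s in se_adjusted]

    return {"coef": [b0, b1], "se_adjusted": se_adjusted,
            "se_software": se_software, "mse": mse}


# Made-up effect sizes, sampling variances, and moderator values
result = weighted_regression([0.2, 0.4, 0.6, 0.5], [0.04] * 4, [0, 1, 2, 3])
print(result["coef"], result["se_adjusted"], result["se_software"])
```

In each imputed data set, the adjusted standard errors (not the ones printed by the regression procedure) are the quantities that would then be carried into the multiple imputation combining step.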

7.1.1.1 R Programs

One R package for generating multiple imputations is Amelia II (Honaker et al. 2011). Directions for using the package are available at http://gking.harvard.edu/amelia/. Once the package is loaded into R, the following command generates m = 5 imputed data sets from the data frame leadimp, with the variable ID treated as an identifier and left out of the imputation model.

> a.out <- amelia(leadimp, m = 5, idvars = "ID")

The imputed data sets can be saved for export into another program to complete the analyses using the command

> write.amelia(obj = a.out, file.stem = "outdata")

where “obj” refers to the object holding the imputed data sets (the result of the amelia command), and “file.stem” provides the base name of the data files that will be written.

Table 7.12 gives the weighted regression estimates for the effect size model from each imputation obtained in Amelia. The two variables with missing observations are average age of subjects and percent of male leaders. The estimates of the regression coefficients vary across the five data sets. This variation reflects the uncertainty introduced by the missing observations.

Table 7.12 Regression estimates from each imputation generated using Amelia

Table 7.13 provides the multiply-imputed estimates for the linear model of effect size. These estimates were combined in Excel, and are fairly consistent with the earlier multiple imputation analysis using Schafer’s Norm program. None of the coefficients are significantly different from zero.
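The combining step can be sketched as follows. The chapter reports that the estimates were combined in Excel; the Python function below is only an illustrative equivalent, assuming the standard combining rules of Rubin (1987): the pooled estimate is the mean of the m estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and the input values are hypothetical and are not the values in Tables 7.12 or 7.13.

```python
import math

def pool_rubin(estimates, variances):
    """Combine one coefficient across m imputed data sets.

    estimates: the m point estimates, one per imputation
    variances: the m squared standard errors, one per imputation
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")

    q_bar = sum(estimates) / m                               # pooled estimate
    u_bar = sum(variances) / m                               # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                              # total variance

    # Degrees of freedom for the t reference distribution;
    # infinite when all imputations give the same estimate.
    if b == 0:
        df = math.inf
    else:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2

    return {"estimate": q_bar, "se": math.sqrt(t), "df": df}


# Made-up coefficients and standard errors from m = 5 imputations
coefs = [0.10, 0.14, 0.08, 0.12, 0.11]
ses = [0.05, 0.06, 0.05, 0.055, 0.05]
print(pool_rubin(coefs, [s ** 2 for s in ses]))
```

The function would be applied once per regression coefficient, with the standard errors already adjusted for the meta-analytic weights.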

Table 7.13 Multiply-imputed estimates from Amelia

7.1.1.2 SAS Proc MI

The SAS procedure PROC MI provides a number of options for imputing data sets with missing values. For the example illustrated in this chapter, we use Markov chain Monte Carlo (MCMC) with a single chain to generate the multiple imputations. We also use the EM estimates as the starting values for the MCMC analysis. The commands below were used with the leadership data to produce the five imputed data sets:

proc mi data = work.leader out = work.leaderimp seed = 101897;
  var year ageave perlead gen2 sizeorg2 rndm2 effsize;
  mcmc;
run;

The first line of the command gives the name of the data set to use, the name of the SAS data set created to hold the imputations, and the seed for the pseudo-random number generator. The second line lists the variables to use in the imputations. Note that the effect size is included in the imputation model. The third line requests Markov chain Monte Carlo to draw from the joint posterior distribution, as described by Rubin (1987). Note that the number of imputations is not specified; the default number of imputed data sets in the SAS release used here is five, the number recommended by Schafer (1997).

SAS Proc MI provides a number of useful tables, including one outlining the missing data patterns and the group means for each variable within each missing data pattern. Once the imputations are generated, the procedure gives the estimates for the mean and standard error of the variables with missing data as illustrated below.
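For readers working outside SAS, the kind of missing data pattern table described above can be approximated with a short sketch. The Python function below is a hypothetical illustration, not a reproduction of the PROC MI output: it groups records by which variables are observed ("X") or missing (".") and reports the group size and the group mean of each variable. The variable names follow the leadership example, but the values are made up.

```python
from collections import defaultdict

def missing_data_patterns(rows, variables):
    """Group records by missing data pattern and report group means.

    rows: list of dicts mapping variable name to a number or None (missing)
    variables: the variable names to tabulate, in display order
    """
    groups = defaultdict(list)
    for row in rows:
        # "X" marks an observed value and "." a missing one
        pattern = "".join("X" if row.get(v) is not None else "." for v in variables)
        groups[pattern].append(row)

    summary = []
    for pattern, members in groups.items():
        means = {}
        for v in variables:
            observed = [r[v] for r in members if r.get(v) is not None]
            means[v] = sum(observed) / len(observed) if observed else None
        summary.append({"pattern": pattern, "n": len(members), "means": means})

    # Largest groups first
    return sorted(summary, key=lambda s: -s["n"])


# Made-up records for illustration only
rows = [
    {"ageave": 40.0, "perlead": 60.0, "effsize": 0.10},
    {"ageave": None, "perlead": 70.0, "effsize": -0.05},
    {"ageave": 46.0, "perlead": 64.0, "effsize": 0.20},
    {"ageave": None, "perlead": None, "effsize": 0.00},
    {"ageave": None, "perlead": 66.0, "effsize": 0.15},
]
for line in missing_data_patterns(rows, ["ageave", "perlead", "effsize"]):
    print(line)
```

Comparing group means across patterns gives an informal check on whether studies with missing moderators differ from studies that report them.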

Multiple Imputation Parameter Estimates

Variable	Mean	SE	Lower 95% CL	Upper 95% CL	DF
Average age of sample	44.109	1.619	40.341	47.877	7.596
Percent of male leaders	65.691	2.898	59.743	71.640	26.869

Variable	Minimum	Maximum	Mu0	t for Mean = Mu0	Pr > |t|
Average age of sample	42.659	45.481	0	27.25	<.0001
Percent of male leaders	64.586	67.390	0	22.67	<.0001

To obtain the weighted regression results for each imputation, we use Proc Reg with weights. The command lines are shown below.

proc reg data = work.leaderimp outest = work.regout covout;
  model effsize = year ageave perlead gen2 sizeorg2 rndm2;
  weight wt;
  by _Imputation_;
run;

The lines given above use the SAS data set generated by Proc MI, and estimate the coefficients for the effect size model using weighted regression. The results are computed for each imputation as indicated in the by statement. Table 7.14 provides the weighted regression results for each imputation.

Table 7.14 Multiple imputations generated using SAS Proc MI

Table 7.15 gives the multiply-imputed estimates for the weighted regression results. As in the prior analyses, none of the regression coefficients were significantly different from zero.

Table 7.15 Multiply-imputed estimates generated by SAS

About this chapter

Cite this chapter

Pigott, T.D. (2012). Missing Data in Meta-analysis: Strategies and Approaches. In: Advances in Meta-Analysis. Statistics for Social and Behavioral Sciences. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-2278-5_7

DOI: https://doi.org/10.1007/978-1-4614-2278-5_7
Published: 26 December 2011
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-2277-8
Online ISBN: 978-1-4614-2278-5
eBook Packages: Mathematics and Statistics (R0)
