Lifetime Data Analysis, Volume 23, Issue 1, pp 113–135

Empirical likelihood method for non-ignorable missing data problems

  • Zhong Guan
  • Jing Qin


Missing responses are ubiquitous in survey sampling and in medical, social science and epidemiological studies. Non-ignorable missingness, in which the probability that a response is missing depends on its own value, is well known to be the most difficult missing data problem. Unlike the ignorable case, the statistical literature contains few papers on non-ignorable missing data apart from fully parametric model-based approaches. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing the empirical likelihood method of Owen (1988), we obtain constrained maximum empirical likelihood estimators of the parameters in the missing probability and of the mean response, which are shown to be asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of data from a real AIDS trial shows that the missingness of CD4 counts at around two years is non-ignorable and that the sample mean based on observed data only is biased.
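The bias of the observed-data sample mean under non-ignorable missingness can be illustrated with a small simulation. The sketch below is not the authors' empirical likelihood estimator: it assumes a standard normal response, a logistic response probability with hypothetical parameters `alpha` and `beta`, and treats those parameters as known in order to form a simple inverse-probability-weighted mean for comparison.

```python
import numpy as np


def simulate_nonignorable(n=200_000, alpha=0.5, beta=1.0, seed=0):
    """Simulate responses whose chance of being observed depends on their own value.

    All settings here (normal responses, logistic response model, alpha, beta)
    are illustrative assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    y = rng.normal(loc=0.0, scale=1.0, size=n)          # true mean is 0
    p_obs = 1.0 / (1.0 + np.exp(-(alpha + beta * y)))   # P(observed | y)
    observed = rng.random(n) < p_obs                    # response indicator
    return y, p_obs, observed


y, p_obs, observed = simulate_nonignorable()

# Naive estimator: average of the observed responses only.
naive_mean = y[observed].mean()

# Inverse-probability-weighted mean using the (assumed known) response model.
weights = 1.0 / p_obs[observed]
ipw_mean = np.sum(weights * y[observed]) / np.sum(weights)

print(f"naive mean of observed responses: {naive_mean:.3f}")
print(f"inverse-probability-weighted mean: {ipw_mean:.3f}")
```

Because larger responses are more likely to be observed in this setup, the naive mean sits above the true mean of zero, while the weighted mean stays close to it. In practice the parameters of the missing probability are unknown, which is the estimation problem the paper addresses.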


Constrained estimation; Empirical likelihood; Non-ignorable missing data; Survey sampling



The authors would like to thank the Editor, the Guest Editor and the two referees for their careful reading and for some useful comments and suggestions that have greatly improved the original submission.


  1. Alho JM (1990) Adjusting for nonresponse bias using logistic regression. Biometrika 77:617–624
  2. Chan KCG, Yam SCP (2014) Oracle, multiple robust and multipurpose calibration in a missing response problem. Stat Sci 29:380–396
  3. Chen J, Qin J (1993) Empirical likelihood estimation for finite populations and the effective usage of auxiliary information. Biometrika 80:107–116
  4. Chen K (2001) Parametric models for response-biased sampling. J R Stat Soc Ser B Stat Methodol 63:775–789
  5. Cochran WG (1977) Sampling techniques, 3rd edn. Wiley series in probability and mathematical statistics. Wiley, New York
  6. Davidian M, Tsiatis AA, Leon S (2005) Semiparametric estimation of treatment effect in a pretest-posttest study with missing data. Stat Sci 20:261–301 (with comments and a rejoinder by the authors)
  7. Godambe VP (1960) An optimum property of regular maximum likelihood estimation. Ann Math Stat 31:1208–1211
  8. Greenlees JS, Reece WS, Zieschang KD (1982) Imputation of missing values when the probability of response depends on the variable being imputed. J Am Stat Assoc 77:251–261
  9. Hall P, Scala BL (1990) Methodology and algorithms of empirical likelihood. Int Stat Rev/Revue Internationale de Statistique 58:109–127
  10. Hammer SM, Katzenstein DA, Hughes MD, Gundacker H, Schooley RT, Haubrich RH, Henry WK, Lederman MM, Phair JP, Niu M, Hirsch MS, Merigan TC (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. N Engl J Med 335:1081–1090
  11. Han P, Wang L (2013) Estimation with missing data: beyond double robustness. Biometrika 100:417–430
  12. Kim JK, Im J (2014) Propensity score adjustment with several follow-ups. Biometrika 101:439–448
  13. Kim JK, Yu CL (2011) A semi-parametric estimation of mean functionals with non-ignorable missing data. J Am Stat Assoc 106:157–165
  14. Li L, Shen C, Li X, Robins JM (2011) On weighting approaches for missing data. Stat Methods Med Res
  15. Liang K-Y, Qin J (2000) Regression analysis under non-standard situations: a pairwise pseudolikelihood approach. J R Stat Soc Ser B 62:773–786
  16. Little RJA (1982) Models for nonresponse in sample surveys. J Am Stat Assoc 77:237–250
  17. Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley series in probability and statistics. Wiley, Hoboken
  18. Nevo A (2003) Using weights to adjust for sample selection when auxiliary information is available. J Bus Econ Stat 21:43–52
  19. Niu C, Guo X, Xu W, Zhu L (2014) Empirical likelihood inference in linear regression with nonignorable missing response. Comput Stat Data Anal 79:91–112
  20. Owen AB (1988) Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75:237–249
  21. Owen AB (2001) Empirical likelihood. Chapman & Hall, Boca Raton
  22. Qin J, Zhang B (2007) Empirical-likelihood-based inference in missing response problems and its application in observational studies. J R Stat Soc Ser B 69:101–122
  23. Rotnitzky A, Robins JM (1997) Analysis of semi-parametric regression models with non-ignorable non-response. Stat Med 16:81–102
  24. Scharfstein DO, Rotnitzky A, Robins JM (1999) Adjusting for nonignorable drop-out using semiparametric nonresponse models. J Am Stat Assoc 94:1096–1146 (with comments and a rejoinder by the authors)
  25. Small CG, McLeish DL (1988) Generalizations of ancillarity, completeness and sufficiency in an inference function space. Ann Stat 16:534–551
  26. Small CG, McLeish DL (1989) Projection as a method for increasing sensitivity and eliminating nuisance parameters. Biometrika 76:693–703
  27. Tan Z (2010) Bounded, efficient and doubly robust estimation with inverse weighting. Biometrika 97:661–682
  28. Tang CY, Leng C (2011) Empirical likelihood and quantile regression in longitudinal data analysis. Biometrika 98:1001–1006
  29. Tang CY, Qin Y (2012) An efficient empirical likelihood approach for estimating equations with missing data. Biometrika 99:1001–1007
  30. Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90:747–764
  31. Vardi Y (1982) Nonparametric estimation in the presence of length bias. Ann Stat 10:616–620
  32. Vardi Y (1985) Empirical distributions in selection bias models. Ann Stat 13:178–205 (with discussion by C. L. Mallows)
  33. Wang Q, Dai P (2008) Semiparametric model-based inference in the presence of missing responses. Biometrika 95:721–734
  34. Wang S, Shao J, Kim JK (2014) An instrument variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24:1097–1116
  35. Zhao P-Y, Tang M-L, Tang N-S (2013) Robust estimation of distribution functions and quantiles with non-ignorable missing data. Can J Stat 41:575–595
  36. Zhong P-S, Chen S (2014) Jackknife empirical likelihood inference with regression imputation and survey data. J Multivar Anal 129:193–205
  37. Zhou Y, Wan ATK, Wang X (2008) Estimating equations inference with missing data. J Am Stat Assoc 103:1187–1199

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Mathematical Sciences, Indiana University South Bend, South Bend, USA
  2. Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Bethesda, USA
