European Journal of Epidemiology, Volume 30, Issue 3, pp 197–207

A new comparison of nested case–control and case–cohort designs and methods

  • Ryung S. Kim
METHODS

Abstract

The existing literature comparing the statistical properties of nested case–control and case–cohort methods has become insufficient for present-day epidemiologists. It has not reconciled conflicting conclusions about the standard methods. Moreover, a comparison that includes newly developed methods, such as inverse probability weighting methods, is needed. Two analytical methods for nested case–control studies and six methods for case–cohort studies, all using the proportional hazards regression model, were summarized and their statistical properties were compared. The answer to which design and method is more powerful was more nuanced than previously reported. For both nested case–control and case–cohort designs, inverse probability weighting methods were more powerful than the standard methods. However, the difference became negligible when the proportion of failure events in the full cohort was very low (<1 %). The comparison between the two designs depended on the censoring type and the incidence proportion. With random censoring, nested case–control designs coupled with the inverse probability weighting method yielded the highest statistical power among all methods for both designs. With fixed censoring times, there was little difference in efficiency between the two designs when inverse probability weighting methods were used; however, the standard case–cohort methods were more powerful than the conditional logistic method for nested case–control designs. As the proportion of failure events in the full cohort became smaller (<10 %), nested case–control methods outperformed all case–cohort methods and the choice of analytic method within each design became less important. When the predictor of interest was binary, the standard case–cohort methods were often more powerful than the conditional logistic method for nested case–control designs.
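
To make the two sampling designs and the idea of inverse probability weighting concrete, the following minimal Python sketch draws a nested case–control sample and a case–cohort sample from a simulated cohort and attaches a weight to each sampled subject. It is an illustration under stated assumptions, not the author's code: the function names, the exponential failure and censoring rates, the number of controls per case (m = 1) and the subcohort fraction (0.2) are all hypothetical. The nested case–control inclusion probability follows the general form used in Samuelsen-type weighting (one minus the product, over the case times at which a subject was at risk, of the probability of not being drawn), and the case–cohort weights use one common choice (1 for cases, the inverse sampling fraction for non-case subcohort members). Fitting the weighted proportional hazards model and estimating its variance are not shown.

```python
import random


def simulate_cohort(n, seed=0):
    """Simulate (observed_time, is_case) pairs with random censoring.

    The exponential rates are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        failure = rng.expovariate(0.1)
        censoring = rng.expovariate(0.3)
        cohort.append((min(failure, censoring), failure <= censoring))
    return cohort


def ncc_sample(cohort, m, seed=1):
    """Nested case-control sampling with m controls per case.

    Returns the sampled subject ids and a dict of inverse probability
    weights. Cases are included with probability 1; for a non-case the
    inclusion probability is 1 - prod(1 - m / (n_j - 1)) over the case
    times t_j at which the subject was still at risk.
    """
    rng = random.Random(seed)
    n = len(cohort)
    case_ids = sorted((i for i in range(n) if cohort[i][1]),
                      key=lambda i: cohort[i][0])
    sampled = set(case_ids)
    prob_never_drawn = [1.0] * n
    for j in case_ids:
        t_j = cohort[j][0]
        # Risk set at t_j, excluding the case itself.
        at_risk = [i for i in range(n) if i != j and cohort[i][0] >= t_j]
        if not at_risk:
            continue
        k = min(m, len(at_risk))
        sampled.update(rng.sample(at_risk, k))
        for i in at_risk:
            prob_never_drawn[i] *= 1.0 - k / len(at_risk)
    weights = {}
    for i in sampled:
        inclusion = 1.0 if cohort[i][1] else 1.0 - prob_never_drawn[i]
        weights[i] = 1.0 / inclusion
    return sampled, weights


def cch_sample(cohort, pi, seed=2):
    """Case-cohort sampling: a random subcohort of fraction pi plus all cases.

    Weights are 1 for cases and 1 / pi for non-case subcohort members
    (one common weighting choice, assumed here for illustration).
    """
    rng = random.Random(seed)
    n = len(cohort)
    subcohort = set(rng.sample(range(n), round(pi * n)))
    weights = {}
    for i, (_, is_case) in enumerate(cohort):
        if is_case:
            weights[i] = 1.0
        elif i in subcohort:
            weights[i] = 1.0 / pi
    return subcohort, weights


cohort = simulate_cohort(500)
ncc_ids, ncc_weights = ncc_sample(cohort, m=1)
subcohort, cch_weights = cch_sample(cohort, pi=0.2)
print(len(ncc_ids), len(cch_weights))
```

In both designs every case is retained and only the non-cases are subsampled, so the weights exist to let the sampled non-cases stand in for the cohort members who were not drawn.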

Keywords

Nested case–control Case–cohort Simulation study Inverse probability weighting 

Notes

Acknowledgments

This work was supported by the National Institutes of Health Grants 1UL1RR025750-01, P30 CA01330-35; and the National Research Foundation of Korea Grant NRF-2012-S1A3A2033416. The author is deeply thankful for the constructive comments from the anonymous referees, which led to significant improvement of this work.

Conflict of interest

None.

Supplementary material

10654_2014_9974_MOESM1_ESM.tif (174 kb)
The Empirical Biases of the Estimators of β₂. The considered methods are the full cohort analysis; two nested case–control methods, which are the conditional logistic approach by Thomas (1977) and the inverse probability weighting method by Samuelsen (1997); and four case–cohort methods, which are the inverse probability weighting method by Binder (1992) and the methods by Prentice (1986), Self & Prentice (1988), and Barlow (1994). The average sample size n* and the average subcohort proportion π* are shown in the titles. CCH and NCC are abbreviations for case–cohort and nested case–control designs, respectively (TIFF 173 kb)
10654_2014_9974_MOESM2_ESM.tif (170 kb)
The Empirical Standard Errors of the Estimators of β₂. The empirical standard errors of the β₂ estimators are shown for the full cohort analysis; two nested case–control (NCC) methods, which are the conditional logistic approach by Thomas (1977) and the inverse probability weighting method by Samuelsen (1997); and four case–cohort (CCH) methods, which are the inverse probability weighting method by Binder (1992) and the methods by Prentice (1986), Self & Prentice (1988), and Barlow (1994). The average sample size n* and the average subcohort proportion π* are shown in the titles. Only the results for N=500, 1,000 are shown (TIFF 170 kb)
10654_2014_9974_MOESM3_ESM.tif (199 kb)
Empirical Power Testing H0: β₂=0. The nominal type 1 error rate was 0.05. The empirical power of nine methods is measured: the full cohort analysis, the conditional logistic approach by Thomas (1977), the inverse probability weighting method by Samuelsen (1997) coupled with the approximate jackknife (AJK) variance estimator (Kim 2013), the inverse probability weighting method by Binder (1992) coupled with the AJK variance estimator, Prentice (1986), Prentice (1986) coupled with the AJK variance estimator, Self & Prentice (1988), Self & Prentice (1988) coupled with the AJK variance estimator (i.e. Lin & Ying 1993), and Barlow (1994). The average sample size n* and the average subcohort proportion π* are shown in the titles. CCH and NCC are abbreviations for case–cohort and nested case–control designs, respectively. Only the results for N=500, 1,000 are shown (TIFF 199 kb)
10654_2014_9974_MOESM4_ESM.tif (173 kb)
N=1,500. The empirical biases and standard errors of the estimators of β₁, the empirical power testing H0: β₁=0, and the empirical standard errors of the estimators of β₂ are shown when N=1,500 (TIFF 172 kb)

References

  1. Thomas D. Addendum to ‘Methods of cohort analysis: appraisal by application to asbestos mining’ by Liddell FDK, McDonald JC, Thomas DC. J R Stat Soc. 1977;A140:469–91.
  2. Prentice RL. A case–cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika. 1986;73:1–11.
  3. Self SG, Prentice RL. Asymptotic distribution theory and efficiency results for case–cohort studies. Ann Stat. 1988;16:64–81.
  4. Langholz B, Thomas D. Nested case–control and case–cohort methods of sampling from a cohort: a critical comparison. Am J Epidemiol. 1990;131:169–76.
  5. Barlow WE, Ichikawa L, Rosner D, Izumi S. Analysis of case–cohort designs. J Clin Epidemiol. 1999;52:1165–72.
  6. Barlow WE. Robust variance estimation for the case–cohort design. Biometrics. 1994;50:1064–72.
  7. Samuelsen S. A pseudo-likelihood approach to analysis of nested case–control studies. Biometrika. 1997;84:379–94.
  8. Kim RS, Kaplan R. Analysis of secondary outcomes in nested case–control study designs. Stat Med. 2014;33:4215–26.
  9. Kim RS. Analysis of nested case–control study designs: revisiting the inverse probability weighting method. Commun Stat Appl Methods. 2013;20:455–66.
  10. Lin DY, Ying Z. Cox regression with incomplete covariate measurements. J Am Stat Assoc. 1993;88:1341–9.
  11. Binder DA. Fitting Cox’s proportional hazards models from survey data. Biometrika. 1992;79:139–47.
  12. Lin DY. On fitting Cox’s proportional hazards models to survey data. Biometrika. 2000;87:37–47.
  13. Andersen PK, Gill RD. Cox’s regression model for counting processes: a large sample study. Ann Stat. 1982;10:1100–20.
  14. Borgan O, Goldstein L, Langholz B. Methods for the analysis of sampled cohort data in the Cox proportional hazards model. Ann Stat. 1995;23:1749–78.
  15. Therneau TM, Li H. Computing the Cox model for case cohort designs. Lifetime Data Anal. 1999;5:99–112.
  16. Lin DY, Wei LJ. The robust inference for the Cox proportional hazards model. J Am Stat Assoc. 1989;84:1074–8.
  17. R Development Core Team. R: a language and environment for statistical computing. Vienna: R Development Core Team; 2010.
  18. R code: six case-cohort and two nested case-control methods. http://missionalconsulting.com/methods/rcode-cch-ncc
  19. Zhang H, Goldstein L. Information and asymptotic efficiency of the case–cohort sampling design in Cox’s regression model. J Multivar Anal. 2003;85:292–317.
  20. Goldstein L, Zhang H. Efficiency of the maximum partial likelihood estimator for nested case control sampling. Bernoulli. 2009;15:569–97.
  21. Wacholder J. Practical considerations in choosing between the case–cohort and NCC designs. Epidemiology. 1991;2:155–8.
  22. Chen KN. Generalized case–cohort sampling. J R Stat Soc Ser B (Stat Methodol). 2001;63:791–809.
  23. Chen KN. Statistical estimation in the proportional hazards model with risk set sampling. Ann Stat. 2004;32:1513–32.
  24. Chen HY. Double-semiparametric method for missing covariates in Cox regression models. J Am Stat Assoc. 2002;97:565–76.
  25. Scheike TH, Juul A. Maximum likelihood estimation for Cox’s regression model under nested case–control sampling. Biostatistics. 2004;5:193–206.
  26. Prentice RL, Williams BJ, Peterson AV. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–9.
  27. Lubin JH. Case–control methods in the presence of multiple failure times and competing risks. Biometrics. 1985;41:49–54.
  28. Zhang H, Schaubel DE, Kalbfleisch JD. Proportional hazards regression for the analysis of clustered survival data from case–cohort studies. Biometrics. 2011;67:18–28.
  29. Chen F, Chen K. Case–cohort analysis of clusters of recurrent events. Lifetime Data Anal. 2014;20:1–15.
  30. Xue X, Xie X, Gunter M, Rohan TE, Wassertheil-Smoller S, Ho GY, et al. Testing the proportional hazards assumption in case–cohort analysis. BMC Med Res Methodol. 2013;13:1–10.
  31. Bellera C, MacGrogan G, Debled M, de Lara C, Brouste V, Mathoulin-Pelissier S. Variables with time-varying effects and the Cox model: some statistical concepts illustrated with a prognostic factor study in breast cancer. BMC Med Res Methodol. 2010;10:1–12.
  32. Lu W, Liu M, Chen Y-H. Testing goodness-of-fit for the proportional hazards model based on nested case–control data. Biometrics. 2014. doi: 10.1111/biom.12239.
  33. Ranganathan P, Pramesh CS. Censoring in survival analysis: potential for bias. Perspect Clin Res. 2012;3:40.
  34. Meier EN. A sensitivity analysis for clinical trials with informatively censored survival endpoints. Master’s thesis, University of Washington; 2012.
  35. Braekers R, Veraverbeke N. Cox’s regression model under partially informative censoring. Commun Stat Theory Methods. 2005;34:1793–811.
  36. Lin DY, Robins JM, Wei LJ. Comparing two failure time distributions in the presence of dependent censoring. Biometrika. 1996;83:381–93.

Copyright information

© Springer Science+Business Media Dordrecht 2014

Authors and Affiliations

  1. Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, USA