Nested case–control studies: should one break the matching?

Lifetime Data Analysis

Abstract

In a nested case–control study, controls are selected for each case from the individuals who are at risk at the time at which the case occurs. We say that the controls are matched on study time. To adjust for possible confounding, it is common to match on other variables as well. The standard analysis of nested case–control data is based on a partial likelihood which compares the covariates of each case to those of its matched controls. It has been suggested that one may break the matching of nested case–control data and analyse them as case–cohort data using an inverse probability weighted (IPW) pseudo-likelihood. Further, when some covariates are available for all individuals in the cohort, multiple imputation (MI) makes it possible to use all available data in the cohort. In this paper we review the standard method and the IPW and MI approaches, and compare their performance using simulations that cover a range of scenarios, including one and two endpoints.
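As a rough illustration of the sampling design and of the weights used when the matching is broken, the following Python sketch draws up to m controls per case from the risk set and computes, for each subject, the probability of ever being included in the sample. It is a minimal sketch, not the authors' implementation: the function names are hypothetical, there is no additional matching, no left truncation and no tied event times, and the inclusion probability uses the product form commonly attributed to Samuelsen (1997), which is an assumption of this example rather than something stated in the abstract.

```python
import random


def sample_nested_case_control(times, events, m, rng):
    """For each case, draw up to m controls from the subjects still at risk.

    times[i] is the exit time of subject i, events[i] is 1 for a case.
    Returns a list of (case_index, [control_indices]) pairs.
    """
    sampled_sets = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue
        # Risk set at the case's event time, excluding the case itself.
        # Subjects who become cases later are still eligible as controls.
        at_risk = [j for j, t_j in enumerate(times) if j != i and t_j >= t_i]
        controls = rng.sample(at_risk, min(m, len(at_risk)))
        sampled_sets.append((i, controls))
    return sampled_sets


def inclusion_probabilities(times, events, m):
    """Probability that each subject ends up in the nested case-control sample.

    Cases are included with probability 1. For a non-case, the probability
    of never being drawn is the product, over event times at which the
    subject was at risk, of 1 - m / (n_at_risk - 1) (assumed form).
    """
    probs = []
    for t_j, d_j in zip(times, events):
        if d_j:
            probs.append(1.0)
            continue
        p_never = 1.0
        for t_i, d_i in zip(times, events):
            if d_i and t_i <= t_j:
                n_at_risk = sum(1 for t in times if t >= t_i)
                p_never *= 1.0 - min(1.0, m / (n_at_risk - 1))
        probs.append(1.0 - p_never)
    return probs


if __name__ == "__main__":
    # Toy cohort (hypothetical data): five subjects, two cases.
    times = [5, 3, 8, 2, 7]
    events = [1, 0, 1, 0, 0]
    rng = random.Random(0)
    print(sample_nested_case_control(times, events, 1, rng))
    print(inclusion_probabilities(times, events, 1))
```

In an IPW analysis each sampled subject would then be weighted by the reciprocal of its inclusion probability in a weighted Cox regression, whereas the standard analysis keeps the sampled sets intact and compares each case only with its own controls.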

Acknowledgments

Most of this research was done while Ørnulf Borgan was visiting the Department of Medical Statistics at the London School of Hygiene and Tropical Medicine in the spring of 2014. The department is acknowledged for its hospitality and for providing the best working facilities. We also want to thank Nathalie Støer for letting us use her new R package multipleNCC before it was made publicly available.

Author information

Corresponding author

Correspondence to Ørnulf Borgan.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 35 KB)

About this article

Cite this article

Borgan, Ø., Keogh, R. Nested case–control studies: should one break the matching? Lifetime Data Anal 21, 517–541 (2015). https://doi.org/10.1007/s10985-015-9319-y

  • DOI: https://doi.org/10.1007/s10985-015-9319-y
