Estimating the hazard rate difference from case-cohort studies

  • METHODS
  • European Journal of Epidemiology

Abstract

The case-cohort design, among many two-phase sampling designs, substantially reduces the cost of an epidemiological study by selecting more informative participants within the full cohort for expensive variable measurements. Despite their benefits, additive hazards models, which estimate hazard differences, have rarely been used for the analysis of case-cohort studies due to the lack of software and application examples. In this paper, we describe a newly developed estimation method that fits the additive hazards models to general two-phase sampling studies along with the R package addhazard that implements it. It allows for missing covariates among cases, cohort stratification, robust variances, and the incorporation of auxiliary information from the full cohort to enhance inference precision. We demonstrate the use of this tool to estimate the association of the risk of coronary heart disease (CHD) with biomarkers high-sensitivity C-reactive protein (hs-CRP) and Lipoprotein-associated phospholipase A2 (Lp-PLA2) by analyzing the Atherosclerosis Risk in Communities Study, which adopted a two-phase sampling design for studying these two biomarkers. We show that the use of auxiliary variables from the full cohort based on calibration techniques improves the precision of the hazard difference being estimated. We observe a synergistic effect of the two biomarkers among participants with lower LDL cholesterol (LDL-C): the CHD hazard rate attributable to the combined action of high hs-CRP and high Lp-PLA2 exceeded the sum of the CHD hazard rate attributable to each one independently by 11.58 (95% CI 2.16–21.01) cases per 1000 person-years. With higher LDL-C, we observe the CHD hazard rate attributable to the combined action of high hs-CRP and medium Lp-PLA2 was less than the sum of their individual effects by 13.42 (95% CI 2.44–24.40) cases per 1000 person-years. 
This demonstration serves the dual purposes of illustrating analysis techniques and providing insights into the utility of hs-CRP and Lp-PLA2 for identifying populations at high risk of CHD that traditional risk factors such as LDL-C may miss. Epidemiologists are encouraged to use this new tool to analyze other case-cohort studies and to incorporate auxiliary variables from the full cohort in their analyses.
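
The synergy and antagonism results above are statements about additive interaction. As a minimal sketch, assuming an additive hazards model with two binary exposures and a product term, the interaction contrast is the joint effect minus the sum of the two individual effects, which equals the coefficient of the product term. The function names and the four hazard rates below are hypothetical and chosen only for illustration; they are not taken from the paper or from the addhazard package. The second helper assumes that a reported 95% CI is a symmetric Wald interval, which the abstract does not state.

```python
def interaction_contrast(h00, h10, h01, h11):
    """Additive interaction contrast for two binary exposures.

    h00: hazard rate with neither exposure
    h10: hazard rate with exposure 1 only
    h01: hazard rate with exposure 2 only
    h11: hazard rate with both exposures
    A positive value means the joint effect exceeds the sum of the
    individual effects (synergy); a negative value means it falls short.
    """
    joint_effect = h11 - h00
    sum_of_individual_effects = (h10 - h00) + (h01 - h00)
    return joint_effect - sum_of_individual_effects


def wald_summary(lower, upper, z_crit=1.96):
    """Midpoint and implied standard error of a CI.

    ASSUMPTION: the interval is a symmetric Wald interval,
    estimate +/- z_crit * SE. This is not confirmed by the source.
    """
    estimate = (lower + upper) / 2
    std_error = (upper - lower) / (2 * z_crit)
    return estimate, std_error


if __name__ == "__main__":
    # Hypothetical rates in cases per 1000 person-years (illustration only).
    contrast = interaction_contrast(h00=5.0, h10=8.0, h01=7.0, h11=16.0)
    print("interaction contrast:", contrast)

    # CI bounds as reported in the abstract for the lower LDL-C group.
    estimate, std_error = wald_summary(2.16, 21.01)
    print("CI midpoint:", estimate, "implied SE:", round(std_error, 2))
```

Under the Wald assumption, the midpoint of the reported interval (2.16–21.01) agrees with the reported estimate of 11.58 up to rounding, which is consistent with a symmetric interval on the hazard difference scale.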

Availability of data and materials

The ARIC dataset can be requested through the ARIC website https://biolincc.nhlbi.nih.gov/studies/aric/. The dataset identifier is available upon request from the corresponding author. The NWTSG dataset is available in the R package addhazard. The data dictionary is provided in its reference manual.

Code availability

The analysis code is provided in the supplementary material, and the software can be downloaded from CRAN (https://cran.r-project.org/web/packages/addhazard/index.html) or from GitHub (https://github.com/cran/addhazard).

Acknowledgements

The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. (HHSN268201700001I, HHSN268201700003I, HHSN268201700005I, HHSN268201700004I, HHSN2682017000021). The authors thank the staff and participants of the ARIC study for their important contributions.

Funding

The second author is partially funded by US National Institutes of Health Grant R01HL122212 and the US National Science Foundation Grant DMS1711952.

Author information

Author notes

  1. Norman Breslow is deceased. The first author thanks him for his guidance on her Ph.D. dissertation, which led to this work.

    • Norman E. Breslow

Contributions

JH, GC, and NB conceptualized the paper. JH conducted the analysis and led the drafting of the manuscript. GC and NB provided technical guidance, and DC prepared the dataset. All authors contributed significantly to the editing of the manuscript.

Corresponding author

Correspondence to Jie K. Hu.

Ethics declarations

Conflict of interest

There are no known conflicts of interest.

Ethical approval

The Human Subjects Division of the University of Washington determined that the research activity did not require IRB review and approval.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (PDF, 249 KB)

Appendix

Table 4 Weighted correlation coefficients \(\rho\) between hs-CRP and phase I cohort variables

Cite this article

Hu, J.K., Chan, K.C.G., Couper, D.J. et al. Estimating the hazard rate difference from case-cohort studies. Eur J Epidemiol 36, 1129–1142 (2021). https://doi.org/10.1007/s10654-021-00739-3

  • DOI: https://doi.org/10.1007/s10654-021-00739-3
