Estimating the hazard rate difference from case-cohort studies

Hu, Jie K.; Chan, Kwun C. G.; Couper, David J.; Breslow, Norman E.

doi:10.1007/s10654-021-00739-3

METHODS
Published: 14 June 2021

Volume 36, pages 1129–1142, (2021)

European Journal of Epidemiology

Jie K. Hu (ORCID: orcid.org/0000-0002-7987-8419)¹,
Kwun C. G. Chan¹,
David J. Couper² &
Norman E. Breslow¹

Abstract

The case-cohort design, one of many two-phase sampling designs, substantially reduces the cost of an epidemiological study by selecting the more informative participants within the full cohort for expensive variable measurements. Despite their benefits, additive hazards models, which estimate hazard differences, have rarely been used to analyze case-cohort studies because of a lack of software and application examples. In this paper, we describe a newly developed estimation method that fits additive hazards models to general two-phase sampling studies, along with the R package addhazard that implements it. The method allows for missing covariates among cases, cohort stratification, robust variances, and the incorporation of auxiliary information from the full cohort to enhance the precision of inference. We demonstrate the use of this tool by estimating the association of the risk of coronary heart disease (CHD) with the biomarkers high-sensitivity C-reactive protein (hs-CRP) and lipoprotein-associated phospholipase A₂ (Lp-PLA₂) in the Atherosclerosis Risk in Communities Study, which adopted a two-phase sampling design for studying these two biomarkers. We show that using auxiliary variables from the full cohort through calibration techniques improves the precision of the estimated hazard difference. We observe a synergistic effect of the two biomarkers among participants with lower LDL cholesterol (LDL-C): the CHD hazard rate attributable to the combined action of high hs-CRP and high Lp-PLA₂ exceeded the sum of the CHD hazard rates attributable to each one independently by 11.58 (95% CI 2.16–21.01) cases per 1000 person-years. Among participants with higher LDL-C, the CHD hazard rate attributable to the combined action of high hs-CRP and medium Lp-PLA₂ was less than the sum of their individual effects by 13.42 (95% CI 2.44–24.40) cases per 1000 person-years. This demonstration serves the dual purposes of illustrating analysis techniques and providing insights about the utility of hs-CRP and Lp-PLA₂ for identifying the population at high risk of CHD that traditional risk factors such as LDL-C may miss. Epidemiologists are encouraged to use this new tool to analyze other case-cohort studies and to incorporate auxiliary variables available in the full cohort into their analyses.
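
The additive-scale comparison reported in the abstract can be illustrated with a small Python sketch. It is not taken from the paper or from the addhazard package: the function names, the hazard rates, and the standard error below are all hypothetical, chosen only to show the arithmetic. The interaction contrast is the hazard difference attributable to joint exposure minus the sum of the hazard differences attributable to each exposure alone. A positive value suggests synergy on the additive scale and a negative value suggests antagonism. The interval shown is a Wald interval, which assumes an approximately normal estimator.

```python
def interaction_contrast(h00, h10, h01, h11):
    """Additive interaction contrast on the hazard scale.

    h00: hazard rate with neither exposure (reference group)
    h10: hazard rate with exposure A only
    h01: hazard rate with exposure B only
    h11: hazard rate with both exposures
    All rates must share one unit, e.g. cases per 1000 person-years.
    """
    joint_excess = h11 - h00                      # excess hazard for A and B together
    separate_excess = (h10 - h00) + (h01 - h00)   # sum of the two individual excesses
    return joint_excess - separate_excess


def wald_ci(estimate, std_error, z=1.96):
    """Two-sided Wald interval; z = 1.96 gives roughly 95% coverage."""
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical rates per 1000 person-years (illustrative only, NOT from the paper).
ic = interaction_contrast(h00=5.0, h10=8.0, h01=7.0, h11=20.0)
lower, upper = wald_ci(ic, std_error=2.0)  # hypothetical standard error
print(f"interaction contrast = {ic:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

In an additive hazards model with main effects and a product term for two binary exposures, this contrast equals the coefficient on the product term, so in practice the estimate and its standard error would come from the fitted model rather than from group-specific rates as in this sketch.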

Availability of data and materials

The ARIC dataset can be requested through the ARIC website https://biolincc.nhlbi.nih.gov/studies/aric/. The dataset identifier is available upon request from the corresponding author. The NWTSG dataset is available in the R package addhazard, and its data dictionary is provided in the package reference manual.

Code availability

The analysis code is provided in the supplementary material. The software can be downloaded from CRAN at https://cran.r-project.org/web/packages/addhazard/index.html or from GitHub at https://github.com/cran/addhazard.

Acknowledgements

The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. (HHSN268201700001I, HHSN268201700003I, HHSN268201700005I, HHSN268201700004I, HHSN2682017000021). The authors thank the staff and participants of the ARIC study for their important contributions.

Funding

The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. (HHSN268201700001I, HHSN268201700003I, HHSN268201700005I, HHSN268201700004I, HHSN2682017000021). The second author is partially funded by US National Institutes of Health Grant R01HL122212 and the US National Science Foundation Grant DMS1711952.

Author information

Authors and Affiliations

Department of Biostatistics, University of Washington, Seattle, WA, 98105, USA
Jie K. Hu, Kwun C. G. Chan & Norman E. Breslow
University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
David J. Couper

Author notes

Norman E. Breslow is deceased. The first author thanks him for his guidance on her Ph.D. dissertation, which led to this work.

Contributions

JH, GC, and NB conceptualized the paper. JH performed the analysis and led the drafting of the manuscript. GC and NB provided technical guidance, and DC prepared the dataset. All authors contributed significantly to editing the manuscript.

Corresponding author

Correspondence to Jie K. Hu.

Ethics declarations

Conflict of interest

There are no known conflicts of interest.

Ethical approval

The Human Subjects Division of the University of Washington determined that the research activity did not need IRB review and approval.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 249 kb)

Appendix

Table 4 Weighted correlation coefficients \(\rho\) between hs-CRP and phase I cohort variables

About this article

Cite this article

Hu, J.K., Chan, K.C.G., Couper, D.J. et al. Estimating the hazard rate difference from case-cohort studies. Eur J Epidemiol 36, 1129–1142 (2021). https://doi.org/10.1007/s10654-021-00739-3

Received: 23 July 2020
Accepted: 13 March 2021
Published: 14 June 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s10654-021-00739-3
