Assessing discrimination of risk prediction rules in a clustered data setting

Rosner, Bernard; Qiu, Weiliang; Lee, Mei-Ling T.

doi:10.1007/s10985-012-9240-6

Published: 22 December 2012

Volume 19, pages 242–256, (2013)

Lifetime Data Analysis

Bernard Rosner¹,
Weiliang Qiu¹ &
Mei-Ling T. Lee²

Abstract

The AUC (area under the ROC curve) is a commonly used metric to assess discrimination of risk prediction rules; however, standard errors of the AUC are usually based on the Mann–Whitney U test, which assumes independence of sampling units. For ophthalmologic applications, it is desirable to assess risk prediction rules based on eye-specific outcome variables, which are generally highly, but not perfectly, correlated in fellow eyes [e.g. progression of individual eyes to age-related macular degeneration (AMD)]. In this article, we use the extended Mann–Whitney U test (Rosner and Glynn, Biometrics 65:188–197, 2009) for the case where subunits within a cluster may have different progression status, and we assess discrimination of different prediction rules in this setting. Both data analyses based on progression of AMD and simulation studies show reasonable accuracy of this extended Mann–Whitney U test to assess discrimination of eye-specific risk prediction rules.
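
As a point of reference for the abstract, the following is a minimal Python sketch of the standard, independence-based calculation that the article sets out to extend. It is not the authors' extended test. It computes the AUC as the Mann–Whitney estimate of the probability that a progressing unit scores higher than a non-progressing unit, and attaches the Hanley and McNeil (1982) standard error, which treats every unit as independent and so ignores correlation between fellow eyes. The function names and the risk scores are hypothetical and chosen only for illustration.

```python
from math import sqrt


def auc_mann_whitney(scores_pos, scores_neg):
    """Estimate P(score_pos > score_neg) over all pairs; ties count as 1/2."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Hanley-McNeil (1982) standard error; assumes independent units."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return sqrt(var)


# Hypothetical eye-level risk scores (illustrative values, not study data).
progressed = [0.90, 0.80, 0.60, 0.55]
not_progressed = [0.70, 0.50, 0.40, 0.30, 0.20]

auc = auc_mann_whitney(progressed, not_progressed)
se = hanley_mcneil_se(auc, len(progressed), len(not_progressed))
print(f"AUC = {auc:.3f}, independence-based SE = {se:.3f}")
```

When two eyes of the same person both enter the calculation, the independence assumption behind this standard error no longer holds, which is the situation the extended test in the article is designed to handle.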

References

Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Diagn Radiol 143:29–36
Hodges JL Jr, Lehmann EL (1956) The efficiency of some nonparametric competitors of the t test. Ann Math Stat 27:324–335
Kendall MG, Stuart A (1969) The advanced theory of statistics. Hafner, New York
Li G, Zhou K (2008) A unified approach to nonparametric comparison of receiver operating characteristic curves for longitudinal and clustered data. J Am Stat Assoc 103(482):705–713
National Eye Institute (2011) Age-related macular degeneration. http://www.nei.nih.gov/health/maculardegen/armd_facts.asp. Retrieved March 2011
Obuchowski NA (2003) Receiver operating characteristic curves and their use in radiology. Radiology 229:3–8
Obuchowski NA, McClish DK (1997) Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices. Stat Med 16(13):1529–1542
Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS (2008) Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 27:157–172
Rosner B, Glynn RJ (2009) Power and sample size estimation for the Wilcoxon rank sum test with application to comparisons of C statistics from alternative prediction models. Biometrics 65:188–197
Rosner B, Glynn RJ, Lee MT (2006) Extension of the rank sum test for clustered data: two-group comparisons with group membership defined at the subunit level. Biometrics 62:1251–1259
Rubin DB (1987) Multiple imputation for non-response in surveys. Wiley, New York
Seddon JM, Cote J, Rosner B (2003) Progression of age-related macular degeneration. Arch Ophthalmol 121:1728–1737
Toledano AY, Gatsonis CA (1996) GEEs for ordinal categorical data: arbitrary patterns of missing responses and missingness in a key covariate. Biometrics 55:488–496
Zou KH, O’Malley AJ, Mauri L (2007) Receiver-operating characteristic analysis for evaluating diagnostic tests and predictive models. Circulation 115:654–657

Acknowledgments

This work was supported by the National Institutes of Health Grant EY12269 from the National Eye Institute.

Author information

Authors and Affiliations

¹ Channing Division of Network Medicine, Brigham and Women’s Hospital/Harvard Medical School, 181 Longwood Avenue, Boston, MA, 02115, USA (Bernard Rosner, Weiliang Qiu)
² Department of Epidemiology and Biostatistics, University of Maryland, College Park, MD, 20742, USA (Mei-Ling T. Lee)

Corresponding author

Correspondence to Bernard Rosner.

About this article

Cite this article

Rosner, B., Qiu, W. & Lee, ML.T. Assessing discrimination of risk prediction rules in a clustered data setting. Lifetime Data Anal 19, 242–256 (2013). https://doi.org/10.1007/s10985-012-9240-6

Received: 09 January 2012
Accepted: 06 December 2012
Published: 22 December 2012
Issue Date: April 2013
DOI: https://doi.org/10.1007/s10985-012-9240-6

Keywords

Risk prediction · ROC curves · Clustered data · GEE
