The area under the ROC curve (AUC) is a commonly used metric for assessing the discrimination of risk prediction rules; however, standard errors of the AUC are usually based on the Mann–Whitney U test, which assumes independence of sampling units. For ophthalmologic applications, it is desirable to assess risk prediction rules based on eye-specific outcome variables, which are generally highly, but not perfectly, correlated in fellow eyes [e.g. progression of individual eyes to age-related macular degeneration (AMD)]. In this article, we use the extended Mann–Whitney U test (Rosner and Glynn, Biometrics 65:188–197, 2009) for the case where subunits within a cluster may have different progression status, and we assess the discrimination of different prediction rules in this setting. Both data analyses based on progression of AMD and simulation studies show reasonable accuracy of this extended Mann–Whitney U test for assessing the discrimination of eye-specific risk prediction rules.
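To make the link between the AUC and the Mann–Whitney U statistic concrete, the following minimal sketch computes the AUC as the proportion of (progressing eye, non-progressing eye) pairs in which the progressing eye has the higher risk score, counting ties as one half. The function name `auc_mann_whitney` and the eye-level data are hypothetical illustrations, not taken from the article. The sketch gives only the standard point estimate; it does not implement the extended Mann–Whitney variance for correlated fellow eyes.

```python
def auc_mann_whitney(scores, progressed):
    """AUC as the Mann-Whitney U statistic divided by the number of
    (progressor, non-progressor) pairs. Ties contribute 0.5."""
    pos = [s for s, y in zip(scores, progressed) if y == 1]
    neg = [s for s, y in zip(scores, progressed) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one progressing and one non-progressing eye")
    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0   # concordant pair
            elif p == n:
                u += 0.5   # tied pair
    return u / (len(pos) * len(neg))


# Hypothetical data: three persons, two eyes each (listed eye by eye).
# Fellow eyes may differ in progression status, as in the AMD setting.
scores = [0.9, 0.4, 0.3, 0.6, 0.2, 0.5]   # predicted risk per eye
progressed = [1, 0, 1, 1, 0, 0]           # 1 = eye progressed
auc = auc_mann_whitney(scores, progressed)
print(round(auc, 4))
```

Here 7 of the 9 progressor/non-progressor pairs are concordant, so the estimate is 7/9. A standard error for this estimate computed under independence would ignore the pairing of eyes within persons, which is the issue the extended test is intended to address.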
Kendall MG, Stuart A (1969) The advanced theory of statistics. Hafner, New York
Li G, Zhou K (2008) A unified approach to nonparametric comparison of receiver operating characteristic curves for longitudinal and clustered data. J Am Stat Assoc 103(482):705–713
Obuchowski NA (2003) Receiver operating characteristic curves and their use in radiology. Radiology 229:3–8
Obuchowski NA, McClish DK (1997) Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices. Stat Med 16(13):1529–1542
Pencina MJ, D’Agostino RB Sr, D’Agostino RB Jr, Vasan RS (2008) Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 27:157–172
Rosner B, Glynn RJ (2009) Power and sample size estimation for the Wilcoxon rank sum test with application to comparisons of C statistics from alternative prediction models. Biometrics 65:188–197
Rosner B, Glynn RJ, Lee MT (2006) Extension of the rank sum test for clustered data: two-group comparisons with group membership defined at the subunit level. Biometrics 62:1251–1259