Biases in the Retrospective Calculation of Reliability and Responsiveness from Longitudinal Studies
We critically examine the common practice in quality-of-life assessment of retrospectively identifying improved, unchanged, and worsened subsamples using some form of global rating, and then calculating test-retest reliability coefficients from the unchanged subsample and responsiveness coefficients from the changed subsample. We use data derived from Monte Carlo simulations to examine the relation between measures of reliability and responsiveness derived from retrospective studies using an unchanged subsample and coefficients based on prospective studies with known treatment effect sizes. We also use actual data from longitudinal studies to examine the fit between simulated and published data. Our results show that calculating reliability from an unchanged subsample inflates the computed coefficient from a typical range of 0.6–0.8 up to 0.85–0.95. We similarly demonstrate that responsiveness coefficients based on the changed subsamples overestimate the responsiveness of the instrument, so that even in situations where there is no overall change, these methods yield a responsiveness coefficient that appears acceptably large. Based on these results, we conclude that retrospective methods of calculating reliability and responsiveness coefficients from retrospectively identified subsamples yield upwardly biased estimates and should be discontinued.
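The reliability bias described above can be illustrated with a small simulation. The sketch below is not the authors' simulation design; all parameter values (a true test-retest reliability of about 0.70, the rating-noise standard deviation, and the "unchanged" cutoff) are illustrative assumptions. The key mechanism: when the global rating of change is partly driven by the observed change score, patients labelled "unchanged" tend to be those whose two measurements happened to agree, so the test-retest correlation in that subsample is inflated even though no patient truly changed.

```python
import numpy as np

# Minimal sketch of the retrospective-reliability bias, under assumed parameters.
rng = np.random.default_rng(0)
n = 200_000

true_score = rng.normal(0.0, 1.0, n)            # stable trait: no real change
err_sd = np.sqrt(3.0 / 7.0)                     # chosen so true reliability ~= 0.70
test = true_score + rng.normal(0.0, err_sd, n)  # baseline measurement
retest = true_score + rng.normal(0.0, err_sd, n)

# Hypothetical retrospective global rating: driven by the observed change plus
# extra noise, so "unchanged" patients are disproportionately those whose two
# scores happened to agree by chance.
rating = (retest - test) + rng.normal(0.0, 0.5, n)
unchanged = np.abs(rating) < 0.5

r_full = np.corrcoef(test, retest)[0, 1]
r_unchanged = np.corrcoef(test[unchanged], retest[unchanged])[0, 1]
print(f"full sample: r = {r_full:.2f}, 'unchanged' subsample: r = {r_unchanged:.2f}")
```

With these assumed settings the full-sample correlation sits near the design value of 0.70, while the correlation in the retrospectively selected "unchanged" subsample is substantially higher, matching the direction of bias the abstract reports.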
Keywords: Change Score, Reliability Coefficient, Minimally Important Difference, Clinical Epidemiology, Responsiveness Coefficient