The Use of Soft Endpoints in Clinical Trials: The Search for Clinical Significance

  • Janet Wittes


Measures of health-related quality of life and other “soft” endpoints have appeal to clinical trialists because of their direct relevance to the patient. Unfortunately, while one can define “statistical significance” precisely, what constitutes “clinical significance” remains elusive. A very small difference in a scale, while statistically significant, may have little relevance to the individual patient. Cardiologists have developed a number of soft endpoints, for example, the Killip Scale and the New York Heart Association Score, that define easily recognizable and distinguishable scenarios. Clinical trials in cardiology have used these scores as entry criteria or, occasionally, as primary endpoints and the field has been able to interpret results clinically. Many other fields rely on scales that lack verbal tags to clinical scenarios. The clinical community often cannot interpret data from such scales, even though the scales themselves are quite reliable, precise, and sensitive to change. Without meaningful approaches to defining clinical significance, such scales are unlikely to become acceptable in clinical trials except, perhaps, as exploratory endpoints. This paper discusses several approaches to defining clinical significance, such as attaching changes in scale to changes in risk as defined epidemiologically, matching levels of scale to objective levels of function, using expert panels and groups of patients to calibrate the scales, and adopting the increasingly popular metric “reliable change.”
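The abstract names "reliable change" without giving a formula. As a rough illustration only, the sketch below uses one common formulation, the Jacobson-Truax reliable change index, which divides an individual's observed change by the standard error of a difference score. The choice of this formulation, the function names, and all numbers are assumptions made for illustration; they are not taken from the paper.

```python
import math


def reliable_change_index(baseline, follow_up, sd_baseline, reliability):
    """Jacobson-Truax style reliable change index (an assumed formulation).

    baseline, follow_up: one patient's scores at two visits
    sd_baseline: standard deviation of the scale in a reference sample
    reliability: test-retest reliability of the scale, in [0, 1)
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    if sd_baseline <= 0.0:
        raise ValueError("sd_baseline must be positive")
    # Standard error of measurement for a single score.
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    # Standard error of the difference between two scores.
    s_diff = math.sqrt(2.0) * sem
    return (follow_up - baseline) / s_diff


def is_reliable_change(rci, threshold=1.96):
    """Flag change larger than expected from measurement error alone."""
    return abs(rci) > threshold


# Hypothetical scale: SD of 10 points, test-retest reliability of 0.9.
small = reliable_change_index(50, 55, sd_baseline=10, reliability=0.9)
large = reliable_change_index(50, 60, sd_baseline=10, reliability=0.9)
print(round(small, 3), is_reliable_change(small))
print(round(large, 3), is_reliable_change(large))
```

Under these hypothetical inputs a 5-point change falls within measurement error while a 10-point change does not. Note that a "reliable" change in this sense only rules out measurement noise; whether the change matters clinically is the separate question the paper raises.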


Keywords: Amyotrophic Lateral Sclerosis, Verbal Fluency, Oral Mucositis, Reliable Change, Visual Analogue Scale




Copyright information

© Springer Science+Business Media Dordrecht 2002

Authors and Affiliations

  • Janet Wittes, Statistics Collaborative, USA
