Health Measurement Development and Interpretation

  • Andrew Firth
  • Dianne Bryant
  • Jacques Menetrey
  • Alan Getgood


Outcome measures help clinicians assess the risks and benefits of treatment in relation to a multifaceted definition of health. While surrogate outcomes, including performance-based tests, provide important measures of health, patient-reported outcome measures (PROMs) assess both specific and general aspects of how a patient’s health affects their ability to participate in desired family and societal roles and activities. Clinicians who use measurement tools to evaluate patient progress, to inform decision-making, or for research must understand an instrument’s measurement properties in order to select the most appropriate measure. An instrument with adequate measurement properties will have demonstrated reliability, validity, and the ability to detect important change in the applicable population. For ease of communication, results should be presented using easily interpretable statistics that convey their clinical meaning. This includes giving readers a threshold against which to judge clinical importance, reporting confidence intervals (CIs) around within-group changes (when measuring pre- to post-intervention) or around between-group differences (when comparing interventions), and using summary measures such as the number needed to treat (NNT). In this chapter, we outline the purpose of different outcome measures, their measurement properties, and methods of presenting results so that they can be communicated broadly.
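As an illustration of the presentation methods named above, the following is a minimal Python sketch and not the chapter's own analysis. It computes a between-group difference in mean change with an approximate 95% CI, and an NNT based on the proportion of patients whose improvement meets a threshold of clinical importance. The change scores, the threshold of 10 points, the function names, and the normal approximation (z = 1.96) are all assumptions chosen for illustration.

```python
import math
from statistics import mean, stdev

Z_95 = 1.96  # assumption: normal approximation for a 95% interval

# Hypothetical pre- to post-intervention change scores on a 0-100 PROM.
treated = [12, 15, 8, 20, 11, 14, 9, 18]
control = [5, 11, 3, 7, 12, 6, 4, 8]
THRESHOLD = 10  # hypothetical threshold of clinical importance


def mean_difference_ci(group_a, group_b):
    """Difference in mean change (a minus b) with an approximate 95% CI."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    return diff, diff - Z_95 * se, diff + Z_95 * se


def responder_proportion(changes, threshold):
    """Proportion of patients whose change meets or exceeds the threshold."""
    return sum(c >= threshold for c in changes) / len(changes)


def number_needed_to_treat(group_a, group_b, threshold):
    """NNT = 1 / (difference in responder proportions), rounded up.

    Returns None when group_a shows no advantage over group_b.
    """
    benefit = responder_proportion(group_a, threshold) - responder_proportion(
        group_b, threshold
    )
    if benefit <= 0:
        return None
    return math.ceil(1 / benefit)


if __name__ == "__main__":
    diff, low, high = mean_difference_ci(treated, control)
    print(f"Between-group difference: {diff:.2f} (95% CI {low:.2f} to {high:.2f})")
    print("NNT:", number_needed_to_treat(treated, control, THRESHOLD))
```

With samples this small, a t-based interval would be more appropriate than the normal approximation used here; the sketch only shows how the three quantities relate to one another.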



Copyright information

© ISAKOS 2019

Authors and Affiliations

  • Andrew Firth (1)
  • Dianne Bryant (2)
  • Jacques Menetrey (3)
  • Alan Getgood (1, corresponding author)

  1. Fowler Kennedy Sport Medicine Clinic, 3M Centre, University of Western Ontario, London, Canada
  2. Faculty of Health Sciences, Elborn College, University of Western Ontario, London, Canada
  3. Centre de medicine du sport et de l’exercice, Hirslanden Clinique La Colline, University Hospital of Geneva, Geneva, Switzerland
