Item Response Theory (IRT): Applications in Quality of Life Measurement, Analysis and Interpretation

  • David Cella
  • Chih-Hung Chang
  • Allen W. Heinemann


This article discusses the basic concepts of quality of life (QOL) measurement and item response theory (IRT). Complementing traditional methods, IRT models can be applied to analyze and interpret QOL data collected in a variety of settings. Growing interest in precise QOL measurement in research and clinical settings demands psychometrically sound and clinically meaningful measurement tools, which in turn support the appropriate use of the QOL data that are collected. Advances in IRT, also referred to as modern test theory, make it possible to evaluate questionnaire performance more critically, both at initial development and during subsequent refinement and validation. IRT also offers better methodology for making the interpretation of QOL data, and comparisons between different populations or occasions, more meaningful by converting ordinal observations into linear measures. Empirical results from different studies are provided to assist in the understanding of different IRT models and their applications. It is feasible and promising to integrate IRT models with advanced computer technology to develop a computerized adaptive testing (CAT) platform that delivers tailored tests and arrives at more precise QOL measurement. Administering more targeted test items according to a patient’s level of health via CAT, with real-time scoring and reporting, is not just a possibility but a reality. This can facilitate better use of QOL information between patients and physicians, and ultimately improve patient care.
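To make the idea of targeting items to a patient's level concrete, the following is a minimal sketch rather than the authors' method. It assumes the simplest IRT model, the dichotomous Rasch model, in which the log-odds of endorsing an item equal the person measure minus the item difficulty (both in logits). The item labels and difficulty values are hypothetical, and choosing the next item by maximum information is one common CAT rule among several.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing a dichotomous item under the Rasch model.

    theta and difficulty are on the same linear (logit) scale.
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at theta, which is p * (1 - p)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)


def next_item(theta, item_bank, administered):
    """Return the not-yet-administered item that is most informative at theta."""
    candidates = [name for name in item_bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, item_bank[name]))


# Hypothetical item bank: item label -> difficulty in logits (made-up values).
item_bank = {"walk_block": -1.5, "climb_stairs": 0.0, "run_mile": 2.0}

# For a provisional person estimate of 0.3 logits, the item whose difficulty
# lies closest to 0.3 carries the most information and is selected first.
first = next_item(0.3, item_bank, administered=set())
print(first)
```

In a full CAT loop, the person estimate would be updated after each response and the selection step repeated until a stopping rule (for example, a target standard error) is met; those steps are omitted here.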


Keywords (machine-generated, not supplied by the authors): Differential Item Functioning; Item Response Theory; Item Bank; Item Response Theory Model; Computerized Adaptive Test





Copyright information

© Springer Science+Business Media Dordrecht 2002

Authors and Affiliations

  • David Cella
    • 1
  • Chih-Hung Chang
    • 1
  • Allen W. Heinemann
  1. Northwestern University, USA
