Modern Psychometric Approaches to Analysis of Scales for Health-Related Quality of Life

  • Jakob Bue Bjorner
  • Per Bech


In recent years, much effort has been invested in the development of new instruments for assessment of health-related quality of life (HRQOL). For many new instruments, modern psychometric methods, such as item response theory (IRT) models, have been used, either as a supplement to classical psychometric testing or as the primary methodological approach. We use the term modern psychometric methods to refer to psychometric methods for multi-item scales that (1) examine the contribution of each item to the measurement properties of the overall scale and (2) recognize that items are categorical. These methods include Rasch models (Rasch 1980; Fischer and Molenaar 1995), other IRT models (Samejima 1969; van der Linden and Hambleton 1997), and factor analytic models for categorical data (Muthén 1984). “Modern” psychometric methods actually have a rather long history within psychiatric research, covering both self-reported scales (Bech et al. 1978) and psychiatric outcome rating scales (Bech et al. 1984). During the past 25 years, modern psychometric methods have increasingly been used in the analysis of patient-reported outcome measures (Teresi et al. 1989; Haley et al. 1994). For example, the NIH-sponsored Patient-Reported Outcomes Measurement Information System (PROMIS) project relies primarily on modern psychometric methods (Reeve et al. 2007). Similarly, modern psychometric analyses are beginning to be adopted for patient-reported HRQOL measures for patients with schizophrenia (D’haenen 1996; Norholm and Bech 2006; Pan et al. 2007; Boyer et al. 2010; Reise et al. 2011a; Laurens et al. 2012; Mojtabai et al. 2012; Chen et al. 2013; Michel et al. 2013; Galindo-Garre et al. 2015; Park et al. 2015). The present chapter provides an introduction to modern psychometric methods and discusses their potential use for analyses of HRQOL data from patients with schizophrenia. Rather than focusing on one particular approach, we show what the methods have in common and how they can supplement each other.


Keywords: Item Response Theory; Item Response Theory Model; Graded Response Model; Rating Scale Model; Item Response Theory Score


  1. Andersen EB. A goodness of fit test for the Rasch model. Psychometrika. 1973;38:123–40.
  2. Andrich D. A rating formulation for ordered response categories. Psychometrika. 1978;43:561–73.
  3. Awad AG, Voruganti LN, Heslegrave RJ. A conceptual model of quality of life in schizophrenia: description and preliminary clinical validation. Qual Life Res. 1997;6:21–6.
  4. Bech P. Clinical assessments of positive mental health. In: Jeste DV, Palmer BW, editors. Positive psychiatry: a clinician handbook. Washington DC: American Psychiatric Publishing; 2015. p. 127–43.
  5. Bech P, Allerup P, Rosenberg R. The Marke-Nyman temperament scale. Evaluation of transferability using the Rasch item analysis. Acta Psychiatr Scand Suppl. 1978;57:49–58.
  6. Bech P, Allerup P, Reisby N, Gram LF. Assessment of symptom change from improvement curves on the Hamilton depression scale in trials with antidepressants. Psychopharmacology (Berl). 1984;84:276–81.
  7. Bech P, Allerup P, Larsen ER, Csillag C, Licht RW. The Hamilton Depression Scale (HAM-D) and the Montgomery-Asberg Depression Scale (MADRS). A psychometric re-analysis of the European genome-based therapeutic drugs for depression study using Rasch analysis. Psychiatry Res. 2014;217:226–32.
  8. Bjorner JB, Kosinski M, Ware Jr JE. Calibration of an item pool for assessing the burden of headaches: an application of item response theory to the headache impact test (HIT). Qual Life Res. 2003;12:913–33.
  9. Bjorner JB, Chang CH, Thissen D, Reeve BB. Developing tailored instruments: item banking and computerized adaptive assessment. Qual Life Res. 2007;16 Suppl 1:95–108.
  10. Bock RD. The nominal categories model. In: van der Linden WJ, Hambleton RK, editors. Handbook of modern item response theory. Berlin: Springer; 1997. p. 3–50.
  11. Bock RD, Mislevy RJ. Adaptive EAP estimation of ability in a microcomputer environment. Appl Psychol Meas. 1982;6:431–44.
  12. Boyer L, Simeoni MC, Loundou A, D’Amato T, Reine G, Lancon C, et al. The development of the S-QoL 18: a shortened quality of life questionnaire for patients with schizophrenia. Schizophr Res. 2010;121:241–50.
  13. Boyer L, Millier A, Perthame E, Aballea S, Auquier P, Toumi M. Quality of life is predictive of relapse in schizophrenia. BMC Psychiatry. 2013;13:15.
  14. Cai L, Thissen D, du Toit SHC. IRTPRO for Windows [Computer software]. Lincolnwood: Scientific Software International; 2011a.
  15. Cai L, Yang JS, Hansen M. Generalized full-information item bifactor analysis. Psychol Methods. 2011b;16:221–48.
  16. Chen YL, Hsiung PC, Chung L, Chen SC, Pan AW. Psychometric properties of the mastery scale-Chinese version: applying classical test theory and Rasch analysis. Scand J Occup Ther. 2013;20:404–11.
  17. Christensen KB, Bjorner JB, Kreiner S, Petersen JH. Tests for unidimensionality in polytomous Rasch models. Psychometrika. 2002;67:563–74.
  18. Connell J, O’Cathain A, Brazier J. Measuring quality of life in mental health: are we asking the right questions? Soc Sci Med. 2014;120:12–20.
  19. D’haenen H. Measurement of anhedonia. Eur Psychiat. 1996;11:335–43.
  20. Drasgow F, Levine MV, Williams EA. Appropriateness measurement with polychotomous item response models and standardized indices. Br J Math Stat Psychol. 1985;38:67–86.
  21. Ellervik C, Kvetny J, Bech P. The relationship between sleep length and restorative sleep in major depression. Results from the Danish General Suburban Population. Psychother Psychosom. 2016;85(1):45–6.
  22. Embretson SE. Implications of a multidimensional latent trait model for measuring change. In: Collins LM, Horn J, editors. Best methods for the analysis of change. Washington DC: American Psychological Association; 1991. p. 184–203.
  23. Fischer GH, Molenaar IW. Rasch models – foundations, recent developments, and applications. 1st ed. Berlin: Springer; 1995.
  24. Galindo-Garre F, Hidalgo MD, Guilera G, Pino O, Rojo JE, Gomez-Benito J. Modeling the World Health Organization Disability Assessment Schedule II using non-parametric item response models. Int J Methods Psychiatr Res. 2015;24:1–10.
  25. Gardner W, Kelleher KJ, Pajer KA. Multidimensional adaptive testing for mental health problems in primary care. Med Care. 2002;40:812–23.
  26. Glas CAW. Modification indices for the 2-PL and the nominal response model. Psychometrika. 1999;64:273–94.
  27. Glas CAW, Verhelst ND. Tests of fit for polytomous Rasch models. In: Fischer GH, Molenaar IW, editors. Rasch models – foundations, recent developments, and applications. Berlin: Springer; 1995. p. 325–52.
  28. Haley SM, McHorney CA, Ware Jr JE. Evaluation of the MOS SF-36 physical functioning scale (PF-10): I. Unidimensionality and reproducibility of the Rasch item scale. J Clin Epidemiol. 1994;47:671–84.
  29. Hu LT, Bentler PM. Cutoff criteria for fit indices in covariance structure analysis: conventional criteria versus new alternatives. Struct Eq Model. 1999;6:1–55.
  30. Khan A, Lindenmayer JP, Opler M, Yavorsky C, Rothman B, Lucic L. A new Integrated Negative Symptom structure of the Positive and Negative Syndrome Scale (PANSS) in schizophrenia using item response analysis. Schizophr Res. 2013;150:185–96.
  31. Kreiner S, Christensen KB. Analysis of local dependence and multidimensionality in graphical loglinear Rasch models. Commun Stat Theory Methods. 2004;33:1239–76.
  32. Laurens KR, Hobbs MJ, Sunderland M, Green MJ, Mould GL. Psychotic-like experiences in a community sample of 8000 children aged 9 to 11 years: an item response theory analysis. Psychol Med. 2012;42:1495–506.
  33. Liu Y, Thissen D. Comparing score tests and other local dependence diagnostics for the graded response model. Br J Math Stat Psychol. 2014;67:496–513.
  34. Masters GN, Wright BD. The partial credit model. In: van der Linden WJ, Hambleton RK, editors. Handbook of modern item response theory. Berlin: Springer; 1997. p. 101–22.
  35. Michel P, Baumstarck K, Auquier P, Amador X, Dumas R, Fernandez J, et al. Psychometric properties of the abbreviated version of the scale to assess unawareness in mental disorder in schizophrenia. BMC Psychiatry. 2013;13:1–10.
  36. Michel P, Auquier P, Baumstarck K, Loundou A, Ghattas B, Lancon C, et al. How to interpret multidimensional quality of life questionnaires for patients with schizophrenia? Qual Life Res. 2015;24:2483–92.
  37. Mojtabai R, Corey-Lisle PK, Ip EH, Kopeykina I, Haeri S, Cohen LJ, et al. The patient assessment questionnaire: initial validation of a measure of treatment effectiveness for patients with schizophrenia and schizoaffective disorder. Psychiatry Res. 2012;200:857–66.
  38. Mokken RJ. A theory and procedure of scale analysis. Berlin: Mouton; 1971.
  39. Muraki E. A generalized partial credit model. In: van der Linden WJ, Hambleton RK, editors. Handbook of modern item response theory. Berlin: Springer; 1997. p. 153–64.
  40. Muthén BO. A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika. 1984;29:177–85.
  41. Muthén BO, Muthén L. Mplus User’s guide (version 7) [Computer software]. Los Angeles: Muthén & Muthén; 2014.
  42. Norholm V, Bech P. Quality of life in schizophrenic patients: association with depressive symptoms. Nord J Psychiatry. 2006;60:32–7.
  43. Orlando M, Thissen D. Further investigation of the performance of S-X²: an item fit index for use with dichotomous item response theory models. Appl Psychol Meas. 2003;27:289–98.
  44. Orlando M, Sherbourne CD, Thissen D. Summed-score linking using item response theory: application to depression measurement. Psychol Assess. 2000;12:354–9.
  45. Østergaard SD, Lemming OM, Mors O, Correll CU, Bech P. PANSS-6: a valid, brief rating scale for the measurement of symptom severity and cross-sectional remission in schizophrenia. Acta Psychiatr Scand. 2015. doi:10.1111/acps.12526.
  46. Pan AW, Chung L, Fife BL, Hsiung PC. Evaluation of the psychometrics of the social impact scale: a measure of stigmatization. Int J Rehabil Res. 2007;30:235–8.
  47. Park IJ, Jung DC, Hwang SS, Jung HY, Yoon JS, Kim CE, et al. Refinement of the SWN-20 based on the Rasch rating model. Compr Psychiatry. 2015;60:134–41.
  48. Ramsay JO. Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika. 1991;56:611–30.
  49. Rasch G. Probabilistic models for some intelligence and attainment tests. 2nd ed. Chicago: University of Chicago Press; 1980.
  50. Reeve BB, Hays RD, Bjorner JB, Cook KF, Crane PK, Teresi JA, et al. Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS). Med Care. 2007;45:S22–31.
  51. Reise SP, Morizot J, Hays RD. The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Qual Life Res. 2007;16 Suppl 1:19–31.
  52. Reise SP, Horan WP, Blanchard JJ. The challenges of fitting an item response theory model to the social anhedonia scale. J Pers Assess. 2011a;93:213–24.
  53. Reise SP, Ventura J, Keefe RS, Baade LE, Gold JM, Green MF, et al. Bifactor and item response theory analyses of interviewer report scales of cognitive impairment in schizophrenia. Psychol Assess. 2011b;23:245–61.
  54. Rosenbaum PR. Testing the conditional independence and monotonicity assumptions of item response theory. Psychometrika. 1984;49:425–35.
  55. Samejima F. Estimation of latent ability using a response pattern of graded scores. Psy Mono Suppl. 1969;17:1–97.
  56. Samejima F. Graded response model. In: van der Linden WJ, Hambleton RK, editors. Handbook of modern item response theory. Berlin: Springer; 1997. p. 85–100.
  57. Santor DA, Ascher-Svanum H, Lindenmayer JP, Obenchain RL. Item response analysis of the positive and negative syndrome scale. BMC Psychiatry. 2007;7:66.
  58. Sijtsma K, Hemker BT. Nonparametric polytomous IRT models for invariant item ordering, with results for parametric models. Psychometrika. 1998;63:183–200.
  59. Smith RM, Plackner C. The family approach to assessing fit in Rasch measurement. J Appl Meas. 2008;10:424–37.
  60. Stochl J, Jones PB, Perez J, Khandaker GM, Böhnke JR, Croudace TJ. Effects of ignoring clustered data structure in confirmatory factor analysis of ordered polytomous items: a simulation study based on PANSS. Int J Methods Psychiatr Res. 2015.
  61. Stout W, Habing B, Douglas J, Kim RH, Roussos L, Zhang J. Conditional covariance-based nonparametric multidimensionality assessment. Psychol Meas. 2001;20:331–54.
  62. Tennant A, Conaghan PG. The Rasch measurement model in rheumatology: what is it and why use it? When should it be applied, and what should one look for in a Rasch paper? Arthritis Rheum. 2007;57:1358–62.
  63. Teresi JA, Cross PS, Golden RR. Some applications of latent trait analysis to the measurement of ADL. J Gerontol. 1989;44:S196–204.
  64. van den Berg SM, Paap MC, Derks EM. Using multidimensional modeling to combine self-report symptoms with clinical judgment of schizotypy. Psychiatry Res. 2013;206:75–80.
  65. van der Linden WJ, Hambleton RK. Handbook of modern item response theory. Berlin: Springer; 1997.
  66. Veit CL, Ware Jr JE. The structure of psychological distress and well-being in general populations. J Consult Clin Psychol. 1983;51:730–42.
  67. Verhelst ND, Verstralen HHFM. An IRT model for multiple raters. In: Boomsma A, van Duijn M, Snijders T, editors. Essays on item response theory. New York: Springer; 2001. p. 89–108.
  68. Ware Jr JE, Kosinski M, Bjorner JB, Turner-Bowker DM, Maruish M. SF-36 health survey. Manual and interpretation guide. 2nd ed. Lincoln: QualityMetric Incorporated; 2007.
  69. Warm TA. Weighted likelihood estimation of ability in item response theory. Psychometrika. 1989;54:427–50.
  70. Zumbo BD. A handbook on the theory and methods of Differential Item Functioning (DIF): logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa: Directorate of Human Resources Research and Evaluation, Department of National Defense; 1999.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Patient Insights, Optum, Lincoln, USA
  2. Department of Public Health, University of Copenhagen, Copenhagen, Denmark
  3. National Research Centre for the Working Environment, Copenhagen, Denmark
  4. Psychiatric Research Unit, CCMH, Mental Health Centre North Zealand, Copenhagen, Denmark
