Quality of Life Research, Volume 15, Issue 3, pp 331–348

Assessment of Differential Item Functioning for Demographic Comparisons in the MOS SF-36 Health Survey

  • Anthony J. Perkins
  • Timothy E. Stump
  • Patrick O. Monahan
  • Colleen A. McHorney


Objective: To investigate whether items of the Medical Outcomes Study (MOS) 36-Item Short-Form Health Status Survey (SF-36) exhibited differential item functioning (DIF) with respect to age, education, race, and gender.

Methods: The data for this study come from two large national datasets, the MOS and the 1990 National Survey of Functional Health Status (NSFHS). We used logistic regression to identify items exhibiting DIF.

Results: We found DIF to be most problematic for age comparisons. Items flagged for age DIF were vigorous activities, bend/kneel/stoop, bathing or dressing, limited in kind of work, health in general, get sick easier than others, expect health to get worse, felt calm and peaceful, and all four vitality items. Items flagged for education DIF were vigorous activities, health in general, health is excellent, felt calm and peaceful, and been a happy person. Vigorous activities, walk more than a mile, health in general, and expect health to get worse were flagged for DIF when comparing African-Americans with whites. No items were flagged for gender DIF.

Conclusions: We found several consistent patterns of DIF using two national datasets with different population characteristics. In the current study, the effect of DIF rarely transferred to the scale level. Further research is needed to corroborate these results and determine qualitatively why DIF may occur for these specific items.


Keywords: Differential item functioning, Item bias, SF-36, Quality of life





Copyright information

© Springer 2006

Authors and Affiliations

  • Anthony J. Perkins (1, 2)
  • Timothy E. Stump (1, 2)
  • Patrick O. Monahan (3)
  • Colleen A. McHorney (1, 2, 3, 4)

  1. Indiana University Center for Aging Research, Indianapolis
  2. Regenstrief Institute, Inc., Indianapolis
  3. Department of Medicine, Indiana University School of Medicine, Indianapolis
  4. Roudebush VA Medical Center HSR&D, Indianapolis
