Quality of Life Research, Volume 22, Issue 5, pp 1135–1144

Measurement invariance of the SF-12 across European-American, Latina, and African-American postpartum women

  • Tamer F. Desouky
  • Pablo A. Mora
  • Elizabeth A. Howell



The purpose of this study was to determine whether a postpartum-specific version of the SF-12 was invariant across three ethnic groups. Specifically, we examined the presence of differential item functioning (DIF) among European-American, Latina, and African-American mothers. DIF refers to differential endorsement of item responses that is not due to the construct being measured, and it can result in biased group comparisons.


We analyzed cross-sectional data from postpartum women (n = 655) who delivered at an urban hospital in the northeastern USA. A multiple indicators multiple causes (MIMIC) model was used to examine differential item functioning.
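As a rough sketch of how a MIMIC model is commonly written for uniform DIF (the notation below is an assumption for illustration, not taken from the article), each item response depends on the latent health factor and, possibly, directly on group membership:

```latex
\begin{align}
  y^{*}_{ij} &= \lambda_j \,\eta_i + \beta_j \,x_i + \varepsilon_{ij}
    && \text{(measurement part: item } j \text{, respondent } i\text{)} \\
  \eta_i &= \gamma \,x_i + \zeta_i
    && \text{(structural part: group difference in the latent factor)}
\end{align}
```

Here $x_i$ is a hypothetical group indicator (for example, 1 = Latina, 0 = European-American), $\eta_i$ is the latent factor, $\lambda_j$ is the loading of item $j$, and $\gamma$ captures the true group difference in the construct. The direct effect $\beta_j$ is the DIF term: item $j$ is flagged when $\beta_j \neq 0$, meaning group membership shifts responses to that item even among women with the same latent level. Under a logit link, $\exp(\beta_j)$ can be read as an odds ratio.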


The analyses revealed the presence of DIF for three items: item 1 “self-assessed general health,” item 8 “bodily pain,” and item 9 “calm and peaceful.” Only two DIF effects were meaningful based on odds ratios and on the percentage of the total effect accounted for by the DIF effect. Specifically, African-American women differentially endorsed item 8 “bodily pain” when compared to European-American women (OR = 2.11, 95 % CI = 1.20, 3.71), and Latinas were more likely to endorse item 9 “calm and peaceful” when compared to European-American women (OR = 2.62, 95 % CI = 1.64, 4.17).
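The reported odds ratios can be put back on the log-odds scale to see how precisely each DIF effect was estimated. The sketch below is only an illustration: it assumes the intervals are Wald-type 95 % intervals that are symmetric on the log scale, and the function name `log_odds_summary` is hypothetical rather than part of the study's analysis.

```python
import math

def log_odds_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover a log-odds estimate, approximate SE, z, and two-sided p
    from an odds ratio and its 95% CI (assumes a Wald interval on the log scale)."""
    beta = math.log(odds_ratio)                       # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se                                     # Wald statistic
    p = math.erfc(abs(z) / math.sqrt(2))              # two-sided normal p-value
    return beta, se, z, p

# Values as reported in the abstract; the labels are for display only.
reported = {
    "item 8 (bodily pain), African-American vs European-American": (2.11, 1.20, 3.71),
    "item 9 (calm and peaceful), Latina vs European-American": (2.62, 1.64, 4.17),
}

for label, (or_value, low, high) in reported.items():
    beta, se, z, p = log_odds_summary(or_value, low, high)
    print(f"{label}: beta={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Because the standard error is reconstructed from rounded interval limits, the resulting z and p values are approximate and should not be read as the authors' own test statistics.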


The results of this study indicate that the SF-12 is, to a great degree, an invariant measure for the assessment of HRQoL among ethnically diverse postpartum women. More research is needed to examine other aspects of invariance (e.g., configural and metric) and longitudinal invariance in ethnically diverse samples. To better understand ethnic differences in health, future studies need to examine the factors that may underlie DIF effects in quality of life.


SF-12; Structural equation modeling (SEM); Multiple indicators multiple causes modeling (MIMIC); Differential item functioning (DIF); Postpartum; Quality of life



Short-form health survey


Multiple indicators multiple causes


Differential item functioning


Health-related quality of life


Maternity outcomes project


Physical component summary


Mental health component summary


Structural equation modeling


Confirmatory factor analysis


Weighted least squares mean and variance adjusted estimator


Maximum likelihood estimator with robust standard errors


Missing completely at random


Odds ratio


Chi-square test


Comparative fit index


Tucker–Lewis index


Root mean square error of approximation


Weighted root mean square residual



This study was based on the first author’s master’s thesis. This research was supported by the Agency for Healthcare Research and Quality (RO1 HS09698-3) and a Robert Wood Johnson Foundation grant (42680).



Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Tamer F. Desouky (1)
  • Pablo A. Mora (1)
  • Elizabeth A. Howell (2)

  1. Department of Psychology, The University of Texas at Arlington, Arlington, USA
  2. Department of Obstetrics and Gynecology, Mount Sinai Medical Center, New York, USA
