A Comparison of Differential Item Functioning (DIF) Detection for Dichotomously Scored Items Using IRTPRO, BILOG-MG, and IRTLRDIF

  • Mei Ling Ong
  • Seock-Ho Kim
  • Allan Cohen
  • Stephen Cramer
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 140)


This study was designed to provide an empirical comparison of three IRT calibration programs, IRTPRO, BILOG-MG, and IRTLRDIF, all of which can be used for detecting differential item functioning (DIF). The three programs were compared for each of three dichotomous IRT models, the one-parameter logistic, the two-parameter logistic, and the three-parameter logistic models. Results from each of these programs were examined using data from a test designed to predict high school graduation test results in a large Southeastern US state. Results suggested that all three programs detected DIF differently.


Keywords: Differential item functioning · IRTPRO · BILOG-MG · IRTLRDIF · IRT · 1PL · 2PL · 3PL
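The likelihood-ratio approach implemented in IRTLRDIF (and available in IRTPRO) compares a compact model, which constrains an item's parameters to be equal across the reference and focal groups, with an augmented model that frees those parameters; twice the difference in log-likelihoods is referred to a chi-square distribution. The following is a minimal illustrative sketch of that idea for a single 2PL item, under strong simplifying assumptions the real programs do not make: abilities are treated as known rather than estimated, data are simulated, and no anchor-item purification is performed.

```python
# Toy likelihood-ratio DIF test for one 2PL item with known abilities.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(42)

def p2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def neg_loglik(params, theta, y):
    a, b = params
    p = np.clip(p2pl(theta, a, b), 1e-9, 1 - 1e-9)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Simulate DIF in difficulty: reference b = 0.0, focal b = 0.5, common a = 1.2.
n = 1000
theta_r, theta_f = rng.normal(0, 1, n), rng.normal(0, 1, n)
y_r = (rng.random(n) < p2pl(theta_r, 1.2, 0.0)).astype(int)
y_f = (rng.random(n) < p2pl(theta_f, 1.2, 0.5)).astype(int)

# Compact model: one (a, b) for the pooled groups.
fit_c = minimize(neg_loglik, [1.0, 0.0],
                 args=(np.concatenate([theta_r, theta_f]),
                       np.concatenate([y_r, y_f])),
                 method="Nelder-Mead")
# Augmented model: separate (a, b) per group.
fit_r = minimize(neg_loglik, [1.0, 0.0], args=(theta_r, y_r), method="Nelder-Mead")
fit_f = minimize(neg_loglik, [1.0, 0.0], args=(theta_f, y_f), method="Nelder-Mead")

# G^2 = 2 * (loglik_augmented - loglik_compact); df = 2 freed parameters.
G2 = 2.0 * (fit_c.fun - (fit_r.fun + fit_f.fun))
p_value = chi2.sf(G2, df=2)
print(f"G2 = {G2:.2f}, p = {p_value:.4f}")
```

With the simulated 0.5-logit difficulty shift, the augmented model fits markedly better and the test rejects parameter equality; with no true DIF, G2 would typically fall near its null chi-square distribution.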



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Mei Ling Ong
  • Seock-Ho Kim
  • Allan Cohen
  • Stephen Cramer

All authors: Department of Educational Psychology, University of Georgia, Athens, GA, USA
