Using Credible Intervals to Detect Differential Item Functioning in IRT Models

  • Conference paper
Quantitative Psychology (IMPS 2017)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 233)

Abstract

Differential item functioning (DIF) occurs when individuals from different groups with the same level of ability have different probabilities of answering an item correctly. In this paper, we develop a Bayesian approach to detecting DIF based on credible intervals within the framework of item response theory models. The method performed well under both uniform and non-uniform DIF conditions in the two-parameter logistic model. The efficacy of the proposed approach is demonstrated through simulation studies and a real data application.

Acknowledgements

The research was supported by Academia Sinica and the Ministry of Science and Technology of the Republic of China under grant number MOST 104-2118-M-001-008-MY2. The authors would like to thank Rianne Janssen, the Co-Editor, Dr. Yu-Wei Chang and Ms. Yi-Jhen Wu for their helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Henghsiu Tsai.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Su, YH., Chang, J., Tsai, H. (2018). Using Credible Intervals to Detect Differential Item Functioning in IRT Models. In: Wiberg, M., Culpepper, S., Janssen, R., González, J., Molenaar, D. (eds) Quantitative Psychology. IMPS 2017. Springer Proceedings in Mathematics & Statistics, vol 233. Springer, Cham. https://doi.org/10.1007/978-3-319-77249-3_25
