Using Credible Intervals to Detect Differential Item Functioning in IRT Models

Su, Ya-Hui; Chang, Joyce; Tsai, Henghsiu

doi:10.1007/978-3-319-77249-3_25

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 233)

Included in the following conference series:

The Annual Meeting of the Psychometric Society

Abstract

Differential item functioning (DIF) occurs when individuals from different groups who have the same level of ability have different probabilities of answering an item correctly. In this paper, we develop a Bayesian approach that detects DIF using credible intervals within the framework of item response theory (IRT) models. The method performed well under both uniform and non-uniform DIF conditions in the two-parameter logistic model. Its efficacy is demonstrated through simulation studies and a real-data application.
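The abstract does not spell out the detection rule, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes the standard two-parameter logistic form P(correct) = 1 / (1 + exp(-a(theta - b))), treats a between-group difference in difficulty b as uniform DIF and a difference in discrimination a as non-uniform DIF, and flags an item when an equal-tailed credible interval for the difference excludes zero. The function names, the 95% level, and the simulated "posterior draws" are all hypothetical stand-ins; in practice the draws would come from an MCMC fit of the model.

```python
import math
import random


def p_correct(theta, a, b):
    """2PL probability of a correct response for ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval taken from sorted posterior draws
    (simple empirical quantiles, rounded outward)."""
    s = sorted(draws)
    n = len(s)
    lo_idx = int(math.floor((1.0 - level) / 2.0 * (n - 1)))
    hi_idx = int(math.ceil((1.0 + level) / 2.0 * (n - 1)))
    return s[lo_idx], s[hi_idx]


def flags_dif(diff_draws, level=0.95):
    """Flag DIF when the credible interval for a between-group
    parameter difference does not contain zero (assumed rule)."""
    lo, hi = credible_interval(diff_draws, level)
    return not (lo <= 0.0 <= hi)


# Uniform DIF in the 2PL: same ability, same discrimination, but the item
# is harder (larger b) for the focal group, so the probabilities differ.
p_reference = p_correct(theta=0.0, a=1.2, b=0.0)
p_focal = p_correct(theta=0.0, a=1.2, b=0.5)

# Hypothetical stand-ins for posterior draws of the group differences.
# These are simulated here purely for illustration.
random.seed(0)
b_diff_draws = [random.gauss(0.5, 0.1) for _ in range(4000)]  # difficulty gap
a_diff_draws = [random.gauss(0.0, 0.1) for _ in range(4000)]  # no discrimination gap

uniform_dif_flag = flags_dif(b_diff_draws)
nonuniform_dif_flag = flags_dif(a_diff_draws)
print(p_reference, p_focal, uniform_dif_flag, nonuniform_dif_flag)
```

In this sketch the difficulty difference is centred well away from zero, so its interval excludes zero and the item is flagged for uniform DIF, while the discrimination difference is centred on zero and is not flagged.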

Acknowledgements

The research was supported by Academia Sinica and the Ministry of Science and Technology of the Republic of China under grant number MOST 104-2118-M-001-008-MY2. The authors would like to thank Rianne Janssen, the Co-Editor, Dr. Yu-Wei Chang and Ms. Yi-Jhen Wu for their helpful comments and suggestions.

Author information

Authors and Affiliations

Department of Psychology, National Chung Cheng University, 168 University Road, Section 1, Min-Hsiung, Chia-Yi, 62102, Taiwan
Ya-Hui Su
Department of Economics, University of Texas at Austin, 2225 Speedway, Austin, TX, 78712, USA
Joyce Chang
Institute of Statistical Science, Academia Sinica, 128 Academia Road, Section 2, Nankang District, Taipei, 11529, Taiwan
Henghsiu Tsai

Corresponding author

Correspondence to Henghsiu Tsai.

Editor information

Editors and Affiliations

Umeå School of Business, Economics and Statistics, Umeå University, Umeå, Sweden
Marie Wiberg
Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
Steven Culpepper
Faculty of Psychology and Educational Sciences, KU Leuven, Leuven, Belgium
Rianne Janssen
Faculty of Mathematics, Pontificia Universidad Católica de Chile, Santiago, Chile
Jorge González
Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
Dylan Molenaar

About this paper

Cite this paper

Su, YH., Chang, J., Tsai, H. (2018). Using Credible Intervals to Detect Differential Item Functioning in IRT Models. In: Wiberg, M., Culpepper, S., Janssen, R., González, J., Molenaar, D. (eds) Quantitative Psychology. IMPS 2017. Springer Proceedings in Mathematics & Statistics, vol 233. Springer, Cham. https://doi.org/10.1007/978-3-319-77249-3_25

DOI: https://doi.org/10.1007/978-3-319-77249-3_25
Published: 21 April 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77248-6
Online ISBN: 978-3-319-77249-3
eBook Packages: Mathematics and Statistics (R0)
