Robust Measurement via A Fused Latent and Graphical Item Response Theory Model
Item response theory (IRT) plays an important role in psychological and educational measurement. Unlike classical test theory, IRT models aggregate item-level information, yielding more accurate measurements. Most IRT models assume local independence, an assumption unlikely to hold in practice, especially when the number of items is large. Results in the literature and the simulation studies in this paper show that wrongly assuming local independence may lead to inaccurate measurements and differential item functioning. To provide more robust measurements, we propose an integrated approach that adds a graphical component to a multidimensional IRT model to offset the effect of unknown local dependence. The new model contains a confirmatory latent variable component, which measures the targeted latent traits, and a graphical component, which captures the local dependence. An efficient proximal algorithm is proposed for parameter estimation and for learning the structure of the local dependence. This approach can substantially improve measurement even when no prior information on the local dependence structure is available. The model can be applied to measure either a unidimensional latent trait or multidimensional latent traits.
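As an illustration only, the two components described above can be sketched in a few lines of Python. The abstract does not state the model equations, so everything here is an assumption: a logistic item response with loadings `a` and intercepts `d` (the latent component), an Ising-type pairwise interaction matrix `s` (the graphical component), and an L1 penalty on the interactions, for which soft-thresholding is the standard proximal operator. All function and parameter names are hypothetical.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def conditional_prob(
    j: int,
    x: List[int],
    theta: List[float],
    a: List[List[float]],
    d: List[float],
    s: List[List[float]],
) -> float:
    """P(X_j = 1 | other responses, theta) under the ASSUMED fused form.

    a[j] . theta + d[j] is the latent (IRT) part; sum_k s[j][k] * x[k]
    over k != j is the graphical (local dependence) part.
    """
    latent = sum(a_jk * t for a_jk, t in zip(a[j], theta))
    graphical = sum(s[j][k] * x[k] for k in range(len(x)) if k != j)
    return sigmoid(d[j] + latent + graphical)


def soft_threshold(v: float, lam: float) -> float:
    """Proximal operator of lam * |v|: shrink v toward zero by lam.

    Assumes an L1 penalty on the interaction parameters; the abstract
    only says the estimator is regularized and the algorithm proximal.
    """
    return math.copysign(max(abs(v) - lam, 0.0), v)


# Two items, one latent trait. Item 0 depends on item 1 through s[0][1].
a = [[1.0], [2.0]]
d = [0.0, -0.5]
s = [[0.0, 0.8], [0.8, 0.0]]
theta = [0.5]

p_dependent = conditional_prob(0, [0, 1], theta, a, d, s)
p_independent = conditional_prob(0, [0, 1], theta, a, d, [[0.0, 0.0], [0.0, 0.0]])
```

With every entry of `s` set to zero the conditional probability reduces to an ordinary logistic IRT item response function, which is the sense in which the graphical component only adds to the latent component. In a pseudo-likelihood fit of this kind, conditionals like these would be multiplied over items, and a proximal step such as `soft_threshold` would set small interaction estimates exactly to zero, leaving a sparse local dependence structure.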
Keywords: item response theory; local dependence; robust measurement; differential item functioning; graphical model; Ising model; pseudo-likelihood; regularized estimator; Eysenck personality questionnaire-revised
This research was funded by NSF grant DMS-1712657, NSF grant SES-1323977, NSF grant IIS-1633360, Army Research Office grant W911NF-15-1-0159, and NIH grant R01GM047845.