Abstract
In the past decade, computational advances have made application of multidimensional item response theory (MIRT) models a somewhat practical endeavor. Within the context of “diagnostic assessment,” MIRT provides a way to compute subscores on a test as estimates of the latent variables, and its use for that purpose has been proposed. However, the model dimensionality desired for subscores may be too high for straightforward MIRT computation, even with contemporary algorithms and computers. The fact that the testlet response model (TRM) is a computationally convenient reparameterization of a higher-order factor model can be used as the basis of a “shortcut” to MIRT subscore estimation in some cases: Because the TRM is a constrained bifactor model, its item parameters can be estimated using two-dimensional numerical integration, regardless of the total dimensionality of the complete model. After those item parameters are estimated, they can be converted to become the parameters of the corresponding higher-order factor model, and that model can, in turn, be used to compute MIRT subscores. This presentation illustrates the process.
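The conversion step described in the abstract can be sketched as follows. This is a minimal illustration, not the chapter's own notation or code. It assumes a logistic TRM in which the logit for item i in testlet d is a_i(θ − b_i − γ_d), with θ ~ N(0, 1) and independent γ_d ~ N(0, σ_d²). Under that assumption, the equivalent bifactor model has general slope a_i and specific slope a_i σ_d. The matching second-order model has domain loading λ_d = 1 / sqrt(1 + σ_d²) on the general factor and item slope α_i = a_i sqrt(1 + σ_d²) on its domain factor. Item intercepts are unchanged. The function name and the numeric values are hypothetical.

```python
import math


def trm_to_higher_order(general_slopes, testlet_variances, item_testlet):
    """Convert TRM parameters to second-order factor model parameters.

    general_slopes[i]    : TRM slope a_i for item i (assumed parameterization)
    testlet_variances[d] : variance sigma_d^2 of the testlet effect for testlet d
    item_testlet[i]      : index d of the testlet that item i belongs to
    """
    # Loading of each domain factor on the second-order (general) factor.
    loadings = [1.0 / math.sqrt(1.0 + v) for v in testlet_variances]

    # Slope of each item on its own domain factor.
    domain_slopes = [
        a * math.sqrt(1.0 + testlet_variances[d])
        for a, d in zip(general_slopes, item_testlet)
    ]

    # Implied correlations among the domain factors: lambda_d * lambda_d'.
    n = len(loadings)
    correlations = [
        [1.0 if r == c else loadings[r] * loadings[c] for c in range(n)]
        for r in range(n)
    ]
    return loadings, domain_slopes, correlations


# Hypothetical example: four items in two testlets.
a = [1.2, 0.8, 1.5, 1.0]
var = [1.0, 0.25]
testlet = [0, 0, 1, 1]
lam, alpha, corr = trm_to_higher_order(a, var, testlet)
```

As a check on the algebra, α_i λ_d recovers the TRM general slope a_i, and α_i sqrt(1 − λ_d²) recovers the bifactor specific slope a_i σ_d. The correlation matrix is what a correlated-factors MIRT scoring routine would need in order to compute subscores.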
Copyright information
© 2013 Springer Science+Business Media New York
About this paper
Cite this paper
Thissen, D. (2013). Using the Testlet Response Model as a Shortcut to Multidimensional Item Response Theory Subscore Computation. In: Millsap, R.E., van der Ark, L.A., Bolt, D.M., Woods, C.M. (eds) New Developments in Quantitative Psychology. Springer Proceedings in Mathematics & Statistics, vol 66. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9348-8_3
DOI: https://doi.org/10.1007/978-1-4614-9348-8_3
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-9347-1
Online ISBN: 978-1-4614-9348-8
eBook Packages: Mathematics and Statistics (R0)