Using the Testlet Response Model as a Shortcut to Multidimensional Item Response Theory Subscore Computation

Thissen, David

doi:10.1007/978-1-4614-9348-8_3

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 66)

Abstract

In the past decade, computational advances have made application of multidimensional item response theory (MIRT) models a somewhat practical endeavor. Within the context of “diagnostic assessment,” MIRT provides a way to compute subscores on a test as estimates of the latent variables, and its use for that purpose has been proposed. However, the model dimensionality desired for subscores may be too high for straightforward MIRT computation, even with contemporary algorithms and computers. The fact that the testlet response model (TRM) is a computationally convenient reparameterization of a higher-order factor model can be used as the basis of a “shortcut” to MIRT subscore estimation in some cases: Because the TRM is a constrained bifactor model, its item parameters can be estimated using two-dimensional numerical integration, regardless of the total dimensionality of the complete model. After those item parameters are estimated, they can be converted to become the parameters of the corresponding higher-order factor model, and that model can, in turn, be used to compute MIRT subscores. This presentation illustrates the process.
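The conversion step described in the abstract can be sketched with a small amount of algebra. The sketch below is illustrative and not taken from the chapter: it assumes one common TRM parameterization, in which the logit for item i in testlet d is a_i(θ − b_i − γ_d), the general factor θ has unit variance, and γ_d has variance σ_d². Under that assumption the bifactor form has general slope a_i and specific slope a_i·σ_d, the first-order factor for testlet d loads on the general factor with λ_d = 1/√(1 + σ_d²), and the item's slope on its own first-order factor is a_i/λ_d. The function name `trm_to_higher_order` and its arguments are hypothetical.

```python
import math


def trm_to_higher_order(slopes, testlet_of_item, testlet_variances):
    """Convert assumed TRM parameters to bifactor and higher-order form.

    slopes            -- TRM slope a_i for each item
    testlet_of_item   -- index d of the testlet (subscale) each item belongs to
    testlet_variances -- variance sigma_d^2 of each testlet effect
    """
    # Loading of each first-order (subscore) factor on the general factor.
    loadings = [1.0 / math.sqrt(1.0 + v) for v in testlet_variances]

    # Constrained bifactor form: (general slope, specific slope) per item.
    bifactor = [(a, a * math.sqrt(testlet_variances[d]))
                for a, d in zip(slopes, testlet_of_item)]

    # Simple-structure slope of each item on its own first-order factor.
    simple = [a / loadings[d] for a, d in zip(slopes, testlet_of_item)]

    # Correlations among first-order factors implied by the single
    # higher-order factor: lambda_d * lambda_d' off the diagonal.
    k = len(testlet_variances)
    corr = [[1.0 if r == c else loadings[r] * loadings[c] for c in range(k)]
            for r in range(k)]
    return bifactor, simple, loadings, corr


# Hypothetical values: two items, one in each of two testlets.
bifactor, simple, loadings, corr = trm_to_higher_order(
    slopes=[1.0, 2.0], testlet_of_item=[0, 1], testlet_variances=[3.0, 1.0])
```

A quick consistency check on this sketch is that each simple-structure slope squared equals the sum of the squared general and specific slopes, which is the Schmid-Leiman relationship run in reverse. The resulting slopes and factor correlation matrix are what a correlated-factors MIRT scoring routine would take as input; the scoring step itself is not shown here.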

Author information

Authors and Affiliations

Department of Psychology, The University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
David Thissen

Corresponding author

Correspondence to David Thissen.

Editor information

Editors and Affiliations

Department of Psychology, Arizona State University, Tempe, AZ, USA
Roger E. Millsap
Department of Methodology and Statistics, Tilburg University, Tilburg, The Netherlands
L. Andries van der Ark
Department of Educational Psychology, University of Wisconsin, Madison, WI, USA
Daniel M. Bolt
Department of Psychology, University of Kansas, Lawrence, KS, USA
Carol M. Woods

Cite this paper

Thissen, D. (2013). Using the Testlet Response Model as a Shortcut to Multidimensional Item Response Theory Subscore Computation. In: Millsap, R.E., van der Ark, L.A., Bolt, D.M., Woods, C.M. (eds) New Developments in Quantitative Psychology. Springer Proceedings in Mathematics & Statistics, vol 66. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9348-8_3

DOI: https://doi.org/10.1007/978-1-4614-9348-8_3
Published: 13 January 2014
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-9347-1
Online ISBN: 978-1-4614-9348-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
