Using the Testlet Response Model as a Shortcut to Multidimensional Item Response Theory Subscore Computation

  • Conference paper
Book: New Developments in Quantitative Psychology

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 66)

Abstract

In the past decade, computational advances have made application of multidimensional item response theory (MIRT) models a somewhat practical endeavor. Within the context of “diagnostic assessment,” MIRT provides a way to compute subscores on a test as estimates of the latent variables, and its use for that purpose has been proposed. However, the model dimensionality desired for subscores may be too high for straightforward MIRT computation, even with contemporary algorithms and computers. The fact that the testlet response model (TRM) is a computationally convenient reparameterization of a higher-order factor model can be used as the basis of a “shortcut” to MIRT subscore estimation in some cases: Because the TRM is a constrained bifactor model, its item parameters can be estimated using two-dimensional numerical integration, regardless of the total dimensionality of the complete model. After those item parameters are estimated, they can be converted to become the parameters of the corresponding higher-order factor model, and that model can, in turn, be used to compute MIRT subscores. This presentation illustrates the process.
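The conversion step described in the abstract can be sketched numerically. The abstract gives no formulas, so the sketch below assumes one common parameterization, and all function and variable names are hypothetical. In this assumed TRM form, item i in testlet d has slope a_i on a standardized general dimension and slope a_i * sigma_d on a standardized testlet-specific dimension, where sigma_d is the testlet standard deviation. Writing each first-order dimension of the higher-order model as lambda_d * G + sqrt(1 - lambda_d^2) * u_d and matching slopes gives lambda_d = 1 / sqrt(1 + sigma_d^2), a higher-order item slope of a_i * sqrt(1 + sigma_d^2), and an implied correlation of lambda_d * lambda_d' between subscore dimensions d and d'.

```python
import math


def trm_to_higher_order(slopes, testlet_of_item, testlet_sd):
    """Convert assumed TRM parameters to higher-order (correlated-factor) form.

    slopes          -- general-dimension slope a_i for each item
    testlet_of_item -- index d of the testlet each item belongs to
    testlet_sd      -- testlet standard deviation sigma_d for each testlet

    Assumes the TRM is written as a constrained bifactor model in which the
    specific slope of item i equals a_i * sigma_d (an assumption of this
    sketch, not a formula taken from the abstract).
    """
    # Second-order loading of each first-order dimension on the general factor.
    lam = [1.0 / math.sqrt(1.0 + s * s) for s in testlet_sd]

    # Item slope on its first-order dimension: a_i / lambda_d.
    a_star = [a / lam[d] for a, d in zip(slopes, testlet_of_item)]

    # Implied correlations among the first-order (subscore) dimensions.
    n = len(lam)
    corr = [[1.0 if j == k else lam[j] * lam[k] for k in range(n)]
            for j in range(n)]
    return a_star, lam, corr


if __name__ == "__main__":
    # Two testlets, three items; the values are made up for illustration.
    a_star, lam, corr = trm_to_higher_order(
        slopes=[1.0, 2.0, 1.5],
        testlet_of_item=[0, 0, 1],
        testlet_sd=[0.75, 4.0 / 3.0],
    )
    print("higher-order slopes:", a_star)
    print("second-order loadings:", lam)
    print("subscore correlations:", corr)
```

Under these assumptions the item intercepts carry over unchanged, and the resulting slopes and correlation matrix are the inputs a MIRT scoring routine would need to compute subscores.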



Corresponding author

Correspondence to David Thissen.

Copyright information

© 2013 Springer Science+Business Media New York

About this paper

Cite this paper

Thissen, D. (2013). Using the Testlet Response Model as a Shortcut to Multidimensional Item Response Theory Subscore Computation. In: Millsap, R.E., van der Ark, L.A., Bolt, D.M., Woods, C.M. (eds) New Developments in Quantitative Psychology. Springer Proceedings in Mathematics & Statistics, vol 66. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9348-8_3
