Constrained Adaptive Testing with Shadow Tests

  • Wim J. van der Linden
Chapter
Part of the Statistics for Social and Behavioral Sciences book series (SSBS)

Abstract

The intuitive principle underlying adaptive testing is that a test has better measurement properties if the difficulties of its items match the ability of the examinee. Items that are too easy or too difficult have predictable responses and cannot provide much information about the ability of the examinee. The first to formalize this principle was Birnbaum (1968). The information measure he used was Fisher’s well-known information in the sample. For dichotomous response models, the measure is defined as
$$I(\theta) = \sum_{i=1}^{n} I_i(\theta) = \sum_{i=1}^{n} \frac{\left(P_i^\prime(\theta)\right)^2}{P_i(\theta)\left[1 - P_i(\theta)\right]},$$
(2.1)
where $P_i(\theta)$ is the probability of a correct response to item $i = 1, \ldots, n$ for an examinee with ability $\theta$, $P_i^\prime(\theta)$ is its derivative with respect to $\theta$, $I_i(\theta)$ is the information in the examinee’s response to item $i$, and $I(\theta)$ is the information in his or her joint responses to the test.
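As a rough numerical illustration of Eq. (2.1), the sketch below evaluates the item and test information for a small set of items. The abstract does not fix a particular response model, so the two-parameter logistic (2PL) form used here, the function names, and the item parameters are all assumptions chosen only for illustration.

```python
import math


def prob_correct(theta, a, b):
    """Probability of a correct response under an assumed 2PL model.

    a is the discrimination and b the difficulty of the item; both are
    hypothetical parameters, not values taken from the chapter.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def prob_derivative(theta, a, b):
    """Derivative of the 2PL response probability with respect to theta."""
    p = prob_correct(theta, a, b)
    return a * p * (1.0 - p)


def item_information(theta, a, b):
    """Item information as in Eq. (2.1): (P')^2 / (P * (1 - P))."""
    p = prob_correct(theta, a, b)
    dp = prob_derivative(theta, a, b)
    return dp ** 2 / (p * (1.0 - p))


def total_information(theta, items):
    """Test information: the sum of the item informations at theta."""
    return sum(item_information(theta, a, b) for a, b in items)


# Hypothetical items as (a, b) pairs: too easy, matched, and too hard
# for an examinee with ability theta = 0.
items = [(1.0, -3.0), (1.0, 0.0), (1.0, 3.0)]
theta = 0.0

infos = [item_information(theta, a, b) for a, b in items]
for (a, b), info in zip(items, infos):
    print(f"a={a}, b={b}: information = {info:.4f}")
print(f"test information = {total_information(theta, items):.4f}")
```

With these assumed parameters, the item whose difficulty equals the examinee's ability contributes an information of 0.25, while the items three units too easy or too hard contribute far less, which is the matching principle described in the abstract.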

Keywords

Item Pool; Computerize Adaptive Testing; Item Selection; Integer Program Model; Adaptive Testing
These keywords were added by machine and not by the authors.

References

  1. Adema, J. J. (1990). The construction of customized two-stage tests. Journal of Educational Measurement, 27, 241–253.
  2. Armstrong, R. D. & Jones, D. H. (1992). Polynomial algorithms for item matching. Applied Psychological Measurement, 16, 271–288.
  3. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick, Statistical theories of mental test scores (pp. 397–479). Reading, MA: Addison-Wesley.
  4. Chang, H.-H. & van der Linden, W. J. (2003). Optimal stratification of item pools in alpha-stratified adaptive testing. Applied Psychological Measurement, 27, 262–274.
  5. Chang, H.-H. & Ying, Z. (1999). α-stratified multistage computerized adaptive testing. Applied Psychological Measurement, 23, 211–222.
  6. Chang, H.-H. & Ying, Z. (2009). Nonlinear sequential designs for logistic item response models with applications to computerized adaptive tests. Annals of Statistics, 37, 1466–1488.
  7. Cheng, Y. & Chang, H.-H. (2009). The maximum priority index method for severely constrained item selection in computerized adaptive testing. British Journal of Mathematical and Statistical Psychology, 62, 369–383.
  8. Cheng, Y., Chang, H.-H. & Yi, Q. (2007). Two-phase item selection procedure for flexible content balancing in CAT. Applied Psychological Measurement, 31, 467–482.
  9. Cordova, M. J. (1997). Optimization methods in computerized adaptive testing. Unpublished doctoral dissertation, Rutgers University, New Brunswick, NJ.
  10. Ferguson, T. S. (1996). A course in large-sample theory. London: Chapman & Hall.
  11. Glas, C. A. W., Wainer, H. & Bradlow, E. T. (2000). MML and EAP estimation in testlet-based adaptive testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 271–287). Boston: Kluwer-Nijhoff Publishing.
  12. Hetter, R. D. & Sympson, J. B. (1997). Item exposure in CAT-ASVAB. In W. A. Sands, B. K. Waters & J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation (pp. 141–144). Washington, DC: American Psychological Association.
  13. ILOG, Inc. (2003). CPLEX 9.0 [Computer Program and Manual]. Incline Village, NV: Author.
  14. Kingsbury, G. G. & Zara, A. R. (1991). Procedures for selecting items for computerized adaptive tests. Applied Measurement in Education, 2, 359–375.
  15. Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
  16. Luecht, R. M. (1998). Computer-assisted test assembly using optimization heuristics. Applied Psychological Measurement, 22, 224–236.
  17. Luecht, R. M. & Nungester, R. J. (1998). Some practical examples of computer-adaptive sequential testing. Journal of Educational Measurement, 35, 229–249.
  18. Segall, D. O. (1997). Equating the CAT-ASVAB. In W. A. Sands, B. K. Waters & J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation (pp. 81–198). Washington, DC: American Psychological Association.
  19. Stocking, M. L. & Lewis, C. (1998). Controlling item exposure conditional on ability in computerized adaptive testing. Journal of Educational and Behavioral Statistics, 23, 57–75.
  20. Stocking, M. L. & Lewis, C. (2000). Methods of controlling the exposure of items in CAT. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 163–182). Boston: Kluwer-Nijhoff Publishing.
  21. Swanson, L. & Stocking, M. L. (1993). A model and heuristic for solving very large item selection problems. Applied Psychological Measurement, 17, 151–166.
  22. Sympson, J. B. & Hetter, R. D. (1985, October). Controlling item-exposure rates in computerized adaptive testing. Proceedings of the 27th Annual Meeting of the Military Testing Association (pp. 973–977). San Diego: Navy Personnel Research and Development Center.
  23. van der Linden, W. J. (1999). A procedure for empirical initialization of the trait estimator in adaptive testing. Applied Psychological Measurement, 23, 21–29.
  24. van der Linden, W. J. (2000). Optimal assembly of tests with item sets. Applied Psychological Measurement, 24, 225–240.
  25. van der Linden, W. J. (2001a). Adaptive testing with equated number-correct scoring. Applied Psychological Measurement, 24, 343–355.
  26. van der Linden, W. J. (2001b). On complexity in computer-based testing. In G. N. Mills, M. Potenza, J. J. Fremer & W. Ward (Eds.), Computer-based testing: Building the foundation for future assessments (pp. 89–102). Mahwah, NJ: Lawrence Erlbaum Associates.
  27. van der Linden, W. J. (2005a). A comparison of item-selection methods for adaptive tests with content constraints. Journal of Educational Measurement, 42, 283–302.
  28. van der Linden, W. J. (2005b). Linear models for optimal test design. New York: Springer-Verlag.
  29. van der Linden, W. J. (2006). A lognormal model for response times on test items. Journal of Educational and Behavioral Statistics, 31, 181–204.
  30. van der Linden, W. J. (2009a). Predictive control of speededness in adaptive testing. Applied Psychological Measurement, 33, 25–41.
  31. van der Linden, W. J. (2009b). Local observed-score equating. In A. A. von Davier (Ed.), Statistical models for equating, scaling, and linking. New York: Springer-Verlag. In press.
  32. van der Linden, W. J. & Adema, J. J. (1998). Simultaneous assembly of multiple test forms. Journal of Educational Measurement, 35, 185–198 [Addendum in Vol. 36, 90–91].
  33. van der Linden, W. J. & Chang, H.-H. (2003). Implementing content constraints in alpha-stratified adaptive testing using a shadow test approach. Applied Psychological Measurement, 27, 107–120.
  34. van der Linden, W. J. & Luecht, R. M. (1998). Observed equating as a test assembly problem. Psychometrika, 62, 401–418.
  35. van der Linden, W. J. & Reese, L. M. (1998). A model for optimal constrained adaptive testing. Applied Psychological Measurement, 22, 259–270.
  36. van der Linden, W. J. & Veldkamp, B. P. (2007). Conditional item-exposure control in adaptive testing using item-ineligibility probabilities. Journal of Educational and Behavioral Statistics, 32, 398–418.
  37. Veldkamp, B. P. & van der Linden, W. J. (2008). A multiple-shadow-test approach to Sympson-Hetter item-exposure control in adaptive testing. International Journal of Testing, 8, 272–289.
  38. Wainer, H., Bradlow, E. T. & Du, Z. (2000). Testlet response theory: An analog for the 3PL model useful in testlet-based adaptive testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 245–269). Boston: Kluwer-Nijhoff Publishing.
  39. Wainer, H., Bradlow, E. T. & Wang, X. (2007). Testlet response theory and its applications. New York: Cambridge University Press.
  40. Wainer, H. & Kiely, G. L. (1987). Item clusters in computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24, 185–201.

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Wim J. van der Linden (1)
  1. CTB/McGraw-Hill, Monterey, USA
