Sequencing an Adaptive Test Battery
Switching a testing program from a linear to an adaptive format increases its efficiency considerably. The gain in efficiency can be used to shorten the test or to increase the accuracy of the scores. The gain is especially relevant to testing programs in which a battery of tests has to be administered in a single session but the testing time has to remain feasible. Examples of such programs are diagnostic testing for instructional purposes (e.g., Boughton, Yao, & Lewis, 2006; Yao & Boughton, 2007) and large-scale educational assessments. These programs generally involve the reporting of profiles of scores of students, schools, or districts. In order to use such profiles for decision making, each of their individual scores should have satisfactory accuracy. The more advantageous combination of testing time and score accuracy made possible by the use of a battery of adaptive instead of linear tests has been highlighted earlier, for instance, in Brown and Weiss (1977) and Gialluca and Weiss (1979).
Keywords: Posterior Distribution, Reading Comprehension, Logical Reasoning, Test Taker, Item Pool
- Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick, Statistical theories of mental test scores (pp. 397–479). Reading, MA: Addison-Wesley.
- Boughton, K. A., Yao, L. & Lewis, D. M. (2006, April). Reporting diagnostic subscale scores for tests composed of complex structure. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco.
- Brown, J. M. & Weiss, D. J. (1977). An adaptive testing strategy for achievement test batteries (Research Report 77-6). Minneapolis, MN: University of Minnesota, Psychometric Methods Program.
- Gialluca, K. A. & Weiss, D. J. (1979). Efficiency of an adaptive inter-subtest branching strategy in the measurement of classroom achievement (Research Report 79-6). Minneapolis, MN: University of Minnesota, Psychometric Methods Program.
- Hambleton, R. K. & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff Publishing.
- Mulder, J. & van der Linden, W. J. (2009). Multidimensional adaptive testing with optimal design criteria for item selection. Psychometrika, 74. In press.
- Thissen, D. & Mislevy, R. J. (2000). Testing algorithms. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg & D. Thissen, Computerized adaptive testing: A primer (pp. 103–135). Mahwah, NJ: Erlbaum.
- Wainer, H., Vevea, J. L., Camacho, F., Reeve III, B. B., Rosa, K., Nelson, L., Swygert, K. A. & Thissen, D. (2001). Augmented scores: “Borrowing strength” to compute scores based upon small numbers of items. In H. Wainer & D. Thissen (Eds.), Test scoring (pp. 343–387). Mahwah, NJ: Erlbaum.