Applying item response theory and computer adaptive testing: the challenges for health outcomes assessment
We review the papers presented at the NCI/DIA conference to identify areas of controversy and uncertainty and to highlight those aspects of item response theory (IRT) and computer adaptive testing (CAT) that require theoretical or empirical research in order to justify their application to patient reported outcomes (PROs).
IRT and CAT offer exciting potential for the development of a new generation of PRO instruments. However, most of the research into these techniques has been in non-healthcare settings, notably in education. Educational tests are very different from PRO instruments, and consequently problematic issues arise when adapting IRT and CAT to healthcare research.
Clinical scales differ appreciably from educational tests, and symptoms have characteristics distinctly different from examination questions. This affects the transfer of IRT technology. Particular areas of concern when applying IRT to PROs include inadequate software, difficulties in selecting models and communicating results, insufficient testing of local independence and other assumptions, and a need for guidelines on estimating sample size requirements. Similar concerns apply to differential item functioning (DIF), which is an important application of IRT. Multidimensional IRT is likely to be advantageous only for closely related PRO dimensions.
Although IRT and CAT provide appreciable potential benefits, there is a need for circumspection. Not all PRO scales are necessarily appropriate targets for this methodology. Traditional psychometric methods, and especially qualitative methods, continue to have an important role alongside IRT. Research should be funded to address the specific concerns that have been identified.
Keywords: Quality of life; Item response theory; Patient reported outcomes; Health outcomes measurement