Item Parameter Estimation and Item Fit Analysis

  • Cees A. W. Glas
Part of the Statistics for Social and Behavioral Sciences book series (SSBS)


Computer-based testing (CBT), such as computerized adaptive testing (CAT), depends on the availability of a large pool of calibrated test items. Usually, the calibration process consists of two stages.







Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Cees A. W. Glas, Department of Research Methodology, Measurement, and Data Analysis, University of Twente, Enschede, The Netherlands
