Estimation of the Parameters in an Item-Cloning Model for Adaptive Testing

  • Cees A. W. Glas
  • Wim J. van der Linden
  • Hanneke Geerlings
Part of the Statistics for Social and Behavioral Sciences book series (SSBS)


Item response theory (IRT) models with random person parameters have become a common choice among practitioners in the field of educational and psychological measurement. Though the choice of such models was initially motivated by an attempt to avoid the statistical problems inherent in the incidental nature of the person parameters (Bock & Lieberman, 1970), the insight soon emerged that such models more adequately represent cases where the focus is not on the measurement of individual persons but on the estimation of characteristics of populations. Early examples of models with random person parameters in the literature are those proposed by Andersen and Madsen (1977) and Sanathanan and Blumenthal (1978), who were interested in estimates of the mean and variance in a population of persons, and by Mislevy (1991), who provided tools for inference from a response model with a regression structure on the person parameters, introduced to account for the sampling of persons with differing background variables.


Posterior Distribution; Markov Chain Monte Carlo; Item Response Theory; Item Parameter; Item Response Theory Model
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Albers, W., Does, R. J. M. M., Imbos, T. & Janssen, M. P. E. (1989). A stochastic growth model applied to repeated tests of academic knowledge. Psychometrika, 54, 451–466.
  2. Albert, J. H. (1992). Bayesian estimation of normal-ogive item response curves using Gibbs sampling. Journal of Educational and Behavioral Statistics, 17, 251–269.
  3. Andersen, E. B. & Madsen, M. (1977). Estimating the parameters of the latent population distribution. Psychometrika, 42, 357–374.
  4. Béguin, A. A. & Glas, C. A. W. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66, 541–562.
  5. Bejar, I. I. (1993). A generative approach to psychological and educational measurement. In N. Frederiksen, R. J. Mislevy & I. I. Bejar (Eds.), Test theory for a new generation of tests (pp. 323–357). Hillsdale, NJ: Lawrence Erlbaum Associates.
  6. Berger, M. P. F. (1997). Optimal designs for latent variable models: A review. In J. Rost & R. Langeheine (Eds.), Applications of latent trait and latent class models in the social sciences (pp. 71–79). Münster, Germany: Waxmann.
  7. Bock, R. D. & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: application of an EM-algorithm. Psychometrika, 46, 443–459.
  8. Bock, R. D. & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179–197.
  9. Box, G. E. P. & Tiao, G. C. (1973). Bayesian inference in statistical analysis. Reading, MA: Addison-Wesley.
  10. Bradlow, E. T., Wainer, H. & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64, 153–168.
  11. de Boeck, P. (2008). Random item IRT models. Psychometrika, 73, 533–559.
  12. de Jong, M. G., Steenkamp, J. B. E. M. & Fox, J. P. (2007). Relaxing measurement invariance in cross-national consumer research using a hierarchical IRT model. Journal of Consumer Research, 34, 260–278.
  13. Efron, B. (1977). Discussion on maximum likelihood from incomplete data via the EM algorithm (by A. P. Dempster, N. M. Laird and D. B. Rubin). Journal of the Royal Statistical Society (Series B), 39, 1–38.
  14. Fox, J. P. & Glas, C. A. W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 271–288.
  15. Geerlings, H., van der Linden, W. J. & Glas, C. A. W. (2009). Modeling rule-based item generation. Manuscript submitted for publication.
  16. Gelfand, A. E. & Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
  17. Gelman, A., Carlin, J. B., Stern, H. S. & Rubin, D. B. (1995). Bayesian data analysis. London: Chapman and Hall.
  18. Glas, C. A. W. (1992). A Rasch model with a multivariate distribution of ability. In M. Wilson (Ed.), Objective measurement: Theory into practice (Vol. 1; pp. 236–258). Norwood, NJ: Ablex Publishing Corporation.
  19. Glas, C. A. W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica, 8, 647–667.
  20. Glas, C. A. W. (2000). Item calibration and parameter drift. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 183–199). Boston: Kluwer-Nijhoff Publishing.
  21. Glas, C. A. W. & van der Linden, W. J. (2001). Modeling variability in item parameters in item response models (Research Rep. 01-11). Enschede, the Netherlands: University of Twente.
  22. Glas, C. A. W. & van der Linden, W. J. (2003). Computerized adaptive testing with item cloning. Applied Psychological Measurement, 27, 247–261.
  23. Glas, C. A. W., Wainer, H. & Bradlow, E. T. (2000). MML and EAP estimates for the testlet response model. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 271–287). Boston: Kluwer-Nijhoff Publishing.
  24. Janssen, R., Tuerlinckx, F., Meulders, M. & de Boeck, P. (2000). A hierarchical IRT model for criterion-referenced measurement. Journal of Educational and Behavioral Statistics, 25, 285–306.
  25. Johnson, M. S. & Sinharay, S. (2005). Calibration of polytomous item families using Bayesian hierarchical modeling. Applied Psychological Measurement, 29, 369–400.
  26. Kiefer, J. & Wolfowitz, J. (1956). Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. Annals of Mathematical Statistics, 27, 887–906.
  27. Lord, F. M. & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
  28. Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.
  29. Millman, J. (1973). Passing score and test lengths for domain-referenced measures. Review of Educational Research, 43, 205–216.
  30. Millman, J. & Westman, R. S. (1989). Computer-assisted writing of achievement test items: Toward a future technology. Journal of Educational Measurement, 26, 177–190.
  31. Mislevy, R. J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177–195.
  32. Mislevy, R. J. (1991). Randomization-based inferences about latent variables from complex samples. Psychometrika, 56, 177–196.
  33. Neyman, J. & Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica, 16, 1–32.
  34. Patz, R. J. & Junker, B. W. (1999a). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146–178.
  35. Patz, R. J. & Junker, B. W. (1999b). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366.
  36. Robert, C. P. & Casella, G. (1999). Monte Carlo statistical methods. New York: Springer-Verlag.
  37. Roid, G. & Haladyna, T. (1982). A technology for test-item writing. New York: Academic Press.
  38. Sanathanan, L. & Blumenthal, S. (1978). The logistic model and estimation of latent structure. Journal of the American Statistical Association, 73, 794–799.
  39. Shi, J. Q. & Lee, S. Y. (1998). Bayesian sampling based approach for factor analysis models with continuous and polytomous data. British Journal of Mathematical and Statistical Psychology, 51, 233–252.
  40. Sinharay, S., Johnson, M. S. & Williamson, D. M. (2003). Calibrating item families and summarizing the results using family expected response functions. Journal of Educational and Behavioral Statistics, 28, 295–313.
  41. Stocking, M. L. (1989). Empirical estimation errors in item response theory as a function of test properties (Research Report 89-5). Princeton, NJ: Educational Testing Service.
  42. van der Linden, W. J. (1994). Optimum design in item response theory: Test assembly and item calibration. In G. H. Fischer & D. Laming (Eds.), Contributions to mathematical psychology, psychometrics, and methodology (pp. 305–318). New York: Springer-Verlag.
  43. Wainer, H., Bradlow, E. T. & Du, Z. (2000). Testlet response theory: An analog for the 3PL model useful in testlet-based adaptive testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 245–269). Boston: Kluwer-Nijhoff Publishing.
  44. Wingersky, M. S. & Lord, F. M. (1984). An investigation of methods for reducing sampling error in certain IRT procedures. Applied Psychological Measurement, 8, 347–364.

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Cees A. W. Glas (1)
  • Wim J. van der Linden (2)
  • Hanneke Geerlings (1)

  1. Department of Research Methodology, Measurement, and Data Analysis, University of Twente, Enschede, The Netherlands
  2. CTB/McGraw-Hill, Monterey, USA
