Computational Statistics, Volume 15, Issue 3, pp 421–442

Exploring the posterior of a hierarchical IRT model for item effects

  • Rianne Janssen
  • Paul De Boeck


A one-way ANOVA structure is imposed on the item difficulty and item discrimination parameters of a two-parameter hierarchical IRT model for item effects. Bayesian estimation of the model is illustrated for two procedures: Metropolis-Hastings within Gibbs and data-augmented Gibbs sampling. The posterior of the hierarchical IRT model is explored with respect to the location of the parameters and the uncertainty of the parameter estimates. The posterior correlations among parameters are shown to be due to trade-off effects among parameters, either on the same parameter scale or on different parameter scales.
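To make the model structure in the abstract concrete, the following is a minimal simulation sketch, not the authors' specification. It assumes a logistic link, normally distributed item difficulties around group means, and log-normal discriminations around group means. The function name `simulate_hierarchical_2pl`, its arguments, and all numeric values are hypothetical choices for illustration.

```python
import numpy as np

def simulate_hierarchical_2pl(n_persons, group_of_item, mu_b, mu_a,
                              sd_b=0.5, sd_a=0.2, seed=0):
    """Simulate binary responses from a 2PL model with group-level item effects.

    Assumptions (not taken from the paper): logistic link, standard normal
    abilities, normal difficulties and log-normal discriminations whose
    means depend on the item group (a one-way ANOVA layout).
    """
    rng = np.random.default_rng(seed)
    group_of_item = np.asarray(group_of_item)
    # Person abilities
    theta = rng.normal(0.0, 1.0, size=n_persons)
    # Item difficulties scatter around the mean of their item group
    b = rng.normal(np.asarray(mu_b)[group_of_item], sd_b)
    # Discriminations scatter around group means on the log scale (kept positive)
    a = np.exp(rng.normal(np.log(np.asarray(mu_a))[group_of_item], sd_a))
    # 2PL success probabilities, persons in rows and items in columns
    logits = a[None, :] * (theta[:, None] - b[None, :])
    p = 1.0 / (1.0 + np.exp(-logits))
    y = (rng.uniform(size=p.shape) < p).astype(int)
    return y, theta, a, b

# Six items in two hypothetical item groups
y, theta, a, b = simulate_hierarchical_2pl(
    200, [0, 0, 0, 1, 1, 1], mu_b=[-0.5, 0.8], mu_a=[1.0, 1.5]
)
```

Data simulated this way could serve as input for checking parameter recovery with either sampling procedure named in the abstract; the samplers themselves are not sketched here.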


Keywords: Gibbs sampler; Hierarchical modeling; IRT; Item effects; Posterior correlations



Copyright information

© Physica-Verlag 2000

Authors and Affiliations

  • Rianne Janssen (1)
  • Paul De Boeck (1)
  1. Department of Psychology, University of Leuven, Leuven, Belgium
