Item Response Theory: Brief History, Common Models, and Extensions

  • Wim J. van der Linden
  • Ronald K. Hambleton


Long experience with measurement instruments such as thermometers, yardsticks, and speedometers may have left the impression that measurement instruments are physical devices providing measurements that can be read directly off a numerical scale. This impression is certainly not valid for educational and psychological tests. A useful way to view a test is as a series of small experiments in which the tester records a vector of responses by the testee. These responses are not direct measurements, but provide the data from which measurements can be inferred.


Keywords: Item Response Theory, Test Theory, Item Parameter, Item Response Theory Model, Classical Test Theory





Copyright information

© Springer Science+Business Media New York 1997
