Historical Background for Multidimensional Item Response Theory (MIRT)

  • Mark D. Reckase
Part of the Statistics for Social and Behavioral Sciences book series (SSBS)


Multidimensional item response theory (MIRT) results from the convergence of ideas from psychology, education, test development, psychometrics, and statistics. Two general themes underlie the influence of these ideas on the development of MIRT. The first theme is that as our understanding of these areas increases, it becomes clear that the phenomena they study are more complicated than originally thought. The second theme is that the complexity can be represented by models or theories, but these theories and models are idealizations of reality. Because they are idealizations, they can likely be proven false if tested using a large number of observations. Nevertheless, the models can give useful approximations with many practical applications.

It is common in the development of scientific theories to collect data about a phenomenon and then to develop an idealized model of the phenomenon that is consistent with the data. The idealized model is usually presented as a mathematical equation. An example of this approach to theory development is reported in Asimov (1972, p. 158). He describes Galileo in a church observing the swing of lamps hanging from the ceiling by long chains. These lamps were swinging like pendulums, and Galileo is reported to have used his own pulse to time how long each lamp took to make one full swing. From these observations, he developed a mathematical formula that related the length of the chain to the length of time for each swing (period) of a pendulum.
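
For reference, the idealized relation usually attached to this story is the small-angle pendulum formula from modern physics. It is given here as an illustration, not as the formula Galileo himself wrote down. The symbols $T$ (period), $L$ (chain length), and $g$ (gravitational acceleration) are notation assumed for this sketch:

```latex
T \approx 2\pi \sqrt{\frac{L}{g}}
\qquad\Longrightarrow\qquad
\frac{T_2}{T_1} = \sqrt{\frac{L_2}{L_1}}
```

Under this idealization, a chain four times as long gives a period twice as long. The approximation ignores swing amplitude and friction, so sufficiently precise measurements would show departures from it, which is the sense in which such a model is an idealization.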


Keywords: Correct Response; Test Item; Differential Item Functioning; Item Response Theory; Item Parameter


References

  1. Ackerman TA (1992) A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement 29:67–91
  2. Asimov I (1972) Asimov’s new guide to science. Basic Books, New York
  3. Bejar II (1977) An application of the continuous response level model to personality measurement. Applied Psychological Measurement 1:509–521
  4. Binet A, Simon T (1913) A method of measuring the development of intelligence in children (Translated from the French by CH Town). Chicago Medical Book Company, Chicago
  5. Bock RD, Gibbons R, Muraki E (1988) Full information item factor analysis. Applied Psychological Measurement 12:261–280
  6. Burgess MA (1921) The measurement of silent reading. Russell Sage Foundation, New York
  7. Camilli G, Wang M, Fesq J (1995) The effects of dimensionality on equating the Law School Admissions Test. Journal of Educational Measurement 32:79–96
  8. Carroll JB (1945) The effect of difficulty and chance success on correlations between items or between tests. Psychometrika 10:1–19
  9. Carroll JB (1993) Human cognitive abilities: A survey of factor analytic studies. Cambridge University Press, New York
  10. Christoffersson A (1975) Factor analysis of dichotomized variables. Psychometrika 40:5–32
  11. Davey TC, Oshima TC (1994) Linking multidimensional calibrations. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans
  12. Deese J (1958) The psychology of learning (2nd edition). McGraw-Hill, New York
  13. Ebbinghaus H (1885) Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie. Duncker & Humblot, Leipzig
  14. Fischer GH, Molenaar IW (eds) (1995) Rasch models: Foundations, recent developments, and applications. Springer-Verlag, New York
  15. Galton F (1870) Hereditary genius: An inquiry into its laws and consequences. D. Appleton, London
  16. Glas CAW (1992) A Rasch model with a multivariate distribution of ability. In Wilson M (ed) Objective measurement: Theory into practice, Vol. 1. Ablex, Norwood, NJ
  17. Glas CAW, Vos HJ (2000) Adaptive mastery testing using a multidimensional IRT model and Bayesian sequential decision theory (Research Report 00-06). University of Twente, Enschede, The Netherlands
  18. Gulliksen H (1950) Theory of mental tests. Wiley, New York
  19. Kelderman H (1994) Objective measurement with multidimensional polytomous latent trait models. In Wilson M (ed) Objective measurement: Theory into practice, Vol. 2. Ablex, Norwood, NJ
  20. Kirisci L, Hsu T, Yu L (2001) Robustness of item parameter estimation programs to assumptions of unidimensionality and normality. Applied Psychological Measurement 25:146–162
  21. Lord FM (1980) Applications of item response theory to practical testing problems. Lawrence Erlbaum Associates, Hillsdale, NJ
  22. Lord FM, Novick MR (1968) Statistical theories of mental test scores. Addison-Wesley, Reading, MA
  23. McCall WA (1922) How to measure in education. The Macmillan Company, New York
  24. McDonald RP (1985) Factor analysis and related methods. Lawrence Erlbaum Associates, Hillsdale, NJ
  25. McKinley RL, Reckase MD (1982) The use of the general Rasch model with multidimensional item response data (Research Report ONR 82-1). American College Testing, Iowa City, IA
  26. Miller TR, Hirsch TM (1992) Cluster analysis of angular data in applications of multidimensional item response theory. Applied Measurement in Education 5:193–211
  27. Millman J, Greene J (1989) The specification and development of tests of achievement and ability. In Linn RL (ed) Educational measurement (3rd edition). American Council on Education and Macmillan, New York
  28. Mulaik SA (1972) A mathematical investigation of some multidimensional Rasch models for psychological tests. Paper presented at the annual meeting of the Psychometric Society, Princeton, NJ
  29. Rasch G (1960) Probabilistic models for some intelligence and attainment tests. Danmarks Paedagogiske Institut, Copenhagen
  30. Rasch G (1962) On general laws and the meaning of measurement in psychology. Proceedings of the fourth Berkeley symposium on mathematical statistics and probability 4:321–334
  31. Reckase MD (1972) Development and application of a multivariate logistic latent trait model. Unpublished doctoral dissertation, Syracuse University, Syracuse, NY
  32. Reckase MD (1985) The difficulty of test items that measure more than one ability. Applied Psychological Measurement 9:401–412
  33. Reckase MD, Ackerman TA, Carlson JE (1988) Building a unidimensional test using multidimensional items. Journal of Educational Measurement 25:193–204
  34. Reckase MD, Hirsch TM (1991) Interpretation of number-correct scores when the true numbers of dimensions assessed by a test is greater than two. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago
  35. Reckase MD, McKinley RL (1991) The discriminating power of items that measure more than one dimension. Applied Psychological Measurement 15:361–373
  36. Samejima F (1974) Normal ogive model on the continuous response level in the multidimensional space. Psychometrika 39:111–121
  37. Stern W (1914) The psychological methods of testing intelligence. Warwick & York, Baltimore
  38. Sympson JB (1978) A model for testing with multidimensional items. In Weiss DJ (ed) Proceedings of the 1977 Computerized Adaptive Testing Conference, University of Minnesota, Minneapolis
  39. Thissen D, Wainer H (2001) Test scoring. Lawrence Erlbaum Associates, Mahwah, NJ
  40. Thorndike EL (1904) An introduction to the theory of mental and social measurements. The Science Press, New York
  41. van der Linden WJ (2005) Linear models of optimal test design. Springer, New York
  42. van der Linden WJ, Hambleton RK (eds) (1997) Handbook of modern item response theory. Springer, New York
  43. Vernon P (1950) The structure of human abilities. Methuen, London
  44. Wang W, Wilson M (2005) Exploring local item dependence using a random-effects facet model. Applied Psychological Measurement 29:296–318
  45. Whipple GM (1910) Manual of mental and physical tests. Warwick & York, Baltimore
  46. Whitely SE (1980b) Multicomponent latent trait models for ability tests. Psychometrika 45:479–494
  47. Wilson M, Adams R (1995) Rasch models for item bundles. Psychometrika 60:181–198
  48. Yoakum CS, Yerkes RM (1920) Army mental tests. Henry Holt, New York
  49. Harman HH (1976) Modern factor analysis (3rd edition revised). The University of Chicago Press, Chicago
  50. Muthén B (1978) Contributions to factor analysis of dichotomous variables. Psychometrika 43:551–560
  51. Horst P (1965) Factor analysis of data matrices. Holt, Rinehart & Winston, New York
  52. McDonald RP (1967) Nonlinear factor analysis. Psychometric Monograph 15
  53. Rijmen F, De Boeck P (2005) A relationship between a between-item multidimensional IRT model and the mixture Rasch model. Psychometrika 70:481–496
  54. Healy AF, McNamara DS (1996) Verbal learning and memory: Does the modal model still work? In Spence JT, Darley JM, Foss DJ (eds) Annual Review of Psychology 47:143–172
  55. Kallon AA (1916) Standards in silent reading. Boston Department of Educational Investigation and Measurement Bulletin No. 12, School Document 18. Boston
  56. Perie M, Grigg W, Donahue P (2005) The Nation’s Report Card: Reading 2005 (NCES 2006-451). U.S. Department of Education, National Center for Education Statistics, U.S. Government Printing Office, Washington, DC
  57. McDonald RP (1999) Test theory: A unified treatment. Lawrence Erlbaum Associates, Mahwah, NJ

Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Counseling, Educational Psychology, and Special Education Department, Michigan State University, East Lansing, USA
