Gain Scores Revisited Under an IRT Perspective

  • Gerhard H. Fischer
Part of the Lecture Notes in Statistics book series (LNS, volume 157)


For the measurement and statistical assessment of individual gain scores based on item sets that satisfy the assumptions of the Rasch, Rating Scale, or Partial Credit Models, a conditional maximum likelihood estimator, Clopper-Pearson confidence intervals, uniformly most accurate confidence intervals, and uniformly most powerful unbiased tests of the hypothesis of no change are presented. All methods are grounded on the exact conditional distribution of the gain score, given the total score for both time points, so that no asymptotic approximations are required. Typical applications of the methods are mentioned.
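As a rough illustration of the exact conditional approach described above, the following sketch computes the conditional distribution of the time-2 raw score given the total score over both time points. It assumes the simplest case only: dichotomous Rasch items with known difficulties and a single change parameter `eta` added to the person parameter at time 2. The function names (`esf`, `conditional_dist`, `upper_tail_p`) and this parameterization are illustrative assumptions, not the chapter's own notation or software, and the one-sided tail probability shown is a plain exact p-value, not the randomized uniformly most powerful unbiased test.

```python
import math


def esf(eps):
    """Elementary symmetric functions gamma_0..gamma_k of the values in eps."""
    gamma = [1.0]
    for e in eps:
        new = gamma + [0.0]
        # Update from the top down; each new[r] uses the previous gamma values.
        for r in range(len(gamma), 0, -1):
            new[r] += e * gamma[r - 1]
        gamma = new
    return gamma


def conditional_dist(beta1, beta2, r, eta):
    """P(R2 = r2 | R1 + R2 = r) under a dichotomous Rasch model.

    beta1, beta2: assumed known item difficulties at time 1 and time 2.
    eta: hypothetical change parameter (shift of the person parameter at time 2).
    The person parameter itself cancels out by conditioning on the total r.
    """
    g1 = esf([math.exp(-b) for b in beta1])
    g2 = esf([math.exp(-b) for b in beta2])
    lo = max(0, r - len(beta1))   # smallest time-2 score compatible with r
    hi = min(len(beta2), r)       # largest time-2 score compatible with r
    weights = {
        r2: g1[r - r2] * g2[r2] * math.exp(eta * r2)
        for r2 in range(lo, hi + 1)
    }
    total = sum(weights.values())
    return {r2: w / total for r2, w in weights.items()}


def upper_tail_p(beta1, beta2, r1, r2):
    """Exact one-sided p-value for 'no change' (eta = 0) against eta > 0."""
    dist = conditional_dist(beta1, beta2, r1 + r2, eta=0.0)
    return sum(p for score, p in dist.items() if score >= r2)
```

For example, `upper_tail_p([0.0, 0.0], [0.0, 0.0], 0, 2)` evaluates the probability, under no change, of scoring 2 at time 2 given a total of 2 on two equally difficult items per time point. Because the distribution is discrete and computed exactly, no asymptotic approximation enters.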


Keywords: Exponential Family; Gain Score; Category Parameter; Partial Credit Model; Person Parameter





Copyright information

© Springer Science+Business Media New York 2001

Authors and Affiliations

  • Gerhard H. Fischer, Department of Psychology, University of Vienna, Wien, Austria
