Test Calibration

  • Frank B. Baker
  • Seock-Ho Kim
Part of the Statistics for Social and Behavioral Sciences book series (SSBS)


For didactic purposes, all of the preceding chapters have assumed that the metric of the ability scale was known. This metric had a midpoint of zero, a unit of measurement of 1, and a range from negative infinity to positive infinity. The numerical values of the item parameters and the examinees’ ability parameters have been expressed in this metric. While this has served to introduce you to the fundamental concepts of item response theory, it does not represent the actual testing situation. When test constructors write an item, they know what trait they want the item to measure and whether the item is designed to function among low-, medium-, or high-ability examinees. But it is not possible to determine the values of the item’s parameters a priori. In addition, when a test is administered to a group of examinees, it is not known in advance how much of the latent trait each examinee possesses. As a result, a major task is to determine the values of the item parameters and examinee abilities in a metric for the underlying latent trait. In item response theory, this task is called test calibration, and it provides a frame of reference for interpreting test results. Test calibration is accomplished by administering a test to a group of N examinees and dichotomously scoring the examinees’ responses to the J items. Mathematical procedures are then applied to the item response data to create an ability scale that is unique to the particular combination of test items and examinees. The item parameter estimates and the examinees’ estimated abilities are expressed in this metric. Once this is accomplished, the test has been calibrated and the test results can be interpreted via the constructs of item response theory.
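The calibration task described above can be illustrated with a small sketch. The abstract does not say which model or estimation procedure the chapter uses, so everything specific here is an assumption made for illustration: the Rasch (one-parameter logistic) model, a two-stage procedure that alternates between item and ability estimation, and the function names `rasch_prob`, `simulate_responses`, `newton_step`, and `calibrate_rasch`. The metric is fixed by centering the item difficulty estimates at zero, which is one common anchoring choice rather than the only one.

```python
import math
import random


def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def simulate_responses(thetas, bs, seed=0):
    """Generate an N x J matrix of dichotomously scored (0/1) responses."""
    rng = random.Random(seed)
    return [[1 if rng.random() < rasch_prob(t, b) else 0 for b in bs]
            for t in thetas]


def newton_step(observed, probs):
    """Newton-Raphson step (expected minus observed score, over information).

    The step is clamped to [-1, 1] to keep early iterations stable.
    """
    info = sum(p * (1.0 - p) for p in probs)
    step = (sum(probs) - observed) / info
    return max(-1.0, min(1.0, step))


def calibrate_rasch(responses, n_cycles=25, newton_steps=3):
    """Alternate item and ability estimation; return (b_hat, theta_hat)."""
    n_items = len(responses[0])
    # Zero and perfect raw scores have no finite ability estimate, so drop them.
    data = [row for row in responses if 0 < sum(row) < n_items]
    item_scores = [sum(row[j] for row in data) for j in range(n_items)]
    if any(s == 0 or s == len(data) for s in item_scores):
        raise ValueError("an item was answered identically by every examinee")
    raw_scores = [sum(row) for row in data]

    # Starting values: log-odds of the raw score for abilities, zero for items.
    theta = [math.log(r / (n_items - r)) for r in raw_scores]
    b = [0.0] * n_items

    for _ in range(n_cycles):
        # Stage 1: estimate item difficulties, treating abilities as known.
        for j in range(n_items):
            for _ in range(newton_steps):
                probs = [rasch_prob(t, b[j]) for t in theta]
                b[j] += newton_step(item_scores[j], probs)
        # Anchor the metric: center the difficulty estimates at zero.
        mean_b = sum(b) / n_items
        b = [x - mean_b for x in b]
        # Stage 2: estimate abilities, treating item difficulties as known.
        for i in range(len(theta)):
            for _ in range(newton_steps):
                probs = [rasch_prob(theta[i], bj) for bj in b]
                theta[i] -= newton_step(raw_scores[i], probs)
    return b, theta


# Illustrative run on simulated data (all values below are made up).
rng = random.Random(42)
true_b = [-2.0, -1.5, -1.0, -0.5, -0.25, 0.25, 0.5, 1.0, 1.5, 2.0]
true_theta = [rng.gauss(0.0, 1.0) for _ in range(300)]
responses = simulate_responses(true_theta, true_b, seed=7)
b_hat, theta_hat = calibrate_rasch(responses)
print("estimated difficulties:", [round(x, 2) for x in b_hat])
```

The estimates are expressed in a metric defined by this particular set of items and examinees, which is the point the abstract makes: a different sample or item set would yield a different scale. Joint estimation of this kind is known to give somewhat biased difficulty estimates on short tests, so the sketch is meant to show the mechanics rather than to serve as a production procedure.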


Item Response Theory, Latent Trait, Item Difficulty, Item Parameter, Test Calibration



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Frank B. Baker, Educational Psychology, University of Wisconsin-Madison, Madison, USA
  • Seock-Ho Kim, Educational Psychology, University of Georgia, Athens, USA
