Reliability of essay rating

  • Nicholas T. Longford
Part of the Springer Series in Statistics book series (SSS)


Standardized educational tests have until recently been associated almost exclusively with multiple-choice items. In such a test, the examinee is presented items comprising a reading passage stating the question or describing the problem to be solved, followed by a set of response options. One of the options is the correct answer; the others are incorrect. The examinee’s task is to identify the correct response. With such an item format, examinees can be given a large number of items in a relatively short time, say, 40 items in a half-hour test. Scoring the test, that is, recording the correctness of each response, can be done reliably by machines at a moderate cost. A serious criticism of this item format is the limited variety of items that can be administered and that certain aspects of skills and abilities cannot be tested by such items. Many items can be solved more effectively by eliminating all the incorrect response options than by deriving the correct response directly. Certainly, problems that can be formulated as multiple-choice items are much rarer in real life; in many respects, it would be preferable to use items with realistic problems that require the examinees to construct their responses.


Mean Square Error True Score Severity Variance Advance Placement Rater Grade 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Copyright information

© Springer-Verlag New York, Inc. 1995

Authors and Affiliations

  • Nicholas T. Longford
    • 1
  1. 1.Research DivisionEducational Testing ServicePrincetonUSA

Personalised recommendations