An Option-Based Partial Credit Item Response Model

  • Yuanchao (Emily) Bo
  • Charles Lewis
  • David V. Budescu
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 89)


Abstract

Multiple-choice (MC) tests have been criticized for allowing guessing and for failing to credit partial knowledge, and alternative scoring methods and response formats (Ben-Simon et al., Appl Psychol Meas 21:65–88, 1997) have been proposed to address these problems. Modern test theory addresses these issues with binary item response models that include guessing parameters (e.g., the 3PL) or with polytomous IRT models. We propose an option-based partial credit IRT model and a new scoring rule based on a weighted Hamming distance between the option key vector and the option response vector. The test taker (TT)'s estimated ability draws on information from both correct options and distractors. These modifications reduce the benefit of guessing and give credit for partial knowledge. The new model can be tailored to different response formats, and some popular IRT models, such as the 2PL and Bock's nominal model, are special cases of it. Markov chain Monte Carlo (MCMC) methods were used to estimate the model parameters and yielded satisfactory estimates. Simulation studies show that the weighted Hamming distance scores have the highest correlation with TTs' true abilities and that their distribution is less skewed than those of the other scores considered.
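To make the scoring idea concrete, the following is a minimal sketch of a weighted Hamming distance score between an option key vector and an option response vector. The per-option weights and the mapping from distance to a partial-credit score are hypothetical choices made here for illustration; they are not the model's calibrated parameters.

```python
def weighted_hamming(key, response, weights):
    """Sum of weights over option positions where the response differs from the key."""
    if not (len(key) == len(response) == len(weights)):
        raise ValueError("key, response, and weights must have equal length")
    return sum(w for k, r, w in zip(key, response, weights) if k != r)


def item_score(key, response, weights):
    """Map the weighted distance to a partial-credit score in [0, 1]:
    1 for a perfect match, 0 for the maximally distant response.
    (This normalization is an illustrative assumption.)"""
    max_dist = sum(weights)
    return 1.0 - weighted_hamming(key, response, weights) / max_dist


# A 4-option item; the key vector marks the single correct option.
key = [1, 0, 0, 0]
weights = [1.0, 0.5, 0.5, 0.5]  # hypothetical per-option weights

print(item_score(key, [1, 0, 0, 0], weights))  # exact match -> 1.0
print(item_score(key, [0, 1, 0, 0], weights))  # wrong single choice: lower score
print(item_score(key, [1, 1, 0, 0], weights))  # correct option plus one extra mark
```

Because each marked or unmarked option contributes separately to the distance, a TT who eliminates most distractors but marks one extra option still receives partial credit, which is the sense in which the rule rewards partial knowledge.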


Keywords

Item response theory · MCMC · Partial credit · Partial knowledge · Hamming distance · Multiple choice · Scoring rule


References

  1. Andersen EB (1977) Sufficient statistics and latent trait models. Psychometrika 42:69–81
  2. Andrich D (1988) Rasch models for measurement. Sage Publications, Beverly Hills
  3. Bechger TM, Maris G, Verstralen HHFM, Verhelst ND (2005) The Nedelsky model for multiple choice items. In: van der Ark LA, Croon MA, Sijtsma K (eds) New developments in categorical data analysis for the social and behavioral sciences. Erlbaum, Mahwah, pp 187–206
  4. Ben-Simon A, Budescu DV, Nevo B (1997) A comparative study of measures of partial knowledge in multiple-choice tests. Appl Psychol Meas 21:65–88
  5. Bereby-Meyer Y, Meyer J, Budescu DV (2003) Decision making under internal uncertainty: the case of multiple-choice tests with different scoring rules. Acta Psychol 112:207–220
  6. Birnbaum A (1968) Some latent trait models and their use in inferring an examinee’s ability. In: Lord FM, Novick MR (eds) Statistical theories of mental test scores. Addison-Wesley, Reading
  7. Bock RD (1972) Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika 37:29–51
  8. Budescu DV, Bar-Hillel M (1993) To guess or not to guess: a decision theoretic view of formula scoring. J Educ Meas 30:277–291
  9. Budescu DV, Bo Y (in press) Analyzing test-taking behavior: decision theory meets psychometric theory. Psychometrika
  10. Coombs CH, Milholland JE, Womer FB (1956) The assessment of partial knowledge. Educ Psychol Meas 16:13–37
  11. R Development Core Team (2013) R: a language and environment for statistical computing [computer software]. R Foundation for Statistical Computing, Vienna
  12. Dressel PL, Schmidt J (1953) Some modifications of the multiple choice item. Educ Psychol Meas 13:574–595
  13. Echternacht GJ (1976) Reliability and validity of option weighting schemes. Educ Psychol Meas 36:301–309
  14. Frary RB (1989) Partial-credit scoring methods for multiple-choice tests. Appl Meas Educ 2:79–96
  15. Gibbons JD, Olkin I, Sobel M (1977) Selecting and ordering populations: a new statistical methodology. Wiley, New York
  16. Gulliksen H (1950) Theory of mental tests. Wiley, New York
  17. Haladyna TM (1988) Empirically based polytomous scoring of multiple choice test items: a review. Paper presented at the annual meeting of the American Educational Research Association, New Orleans
  18. Hambleton RK, Roberts DM, Traub RE (1970) A comparison of the reliability and validity of two methods for assessing partial knowledge on a multiple-choice test. J Educ Meas 7:75–82
  19. Hamming RW (1950) Error detecting and error correcting codes. Bell Syst Tech J 29:147–160
  20. Hansen R (1971) The influence of variables other than knowledge on probabilistic tests. J Educ Meas 8:9–14
  21. Holzinger KJ (1924) On scoring multiple response tests. J Educ Psychol 15:445–447
  22. Hutchinson TP (1982) Some theories of performance in multiple-choice tests, and their implications for variants of the task. Br J Math Stat Psychol 35:71–89
  23. Jacobs SS (1971) Correlates of unwarranted confidence in responses to objective test items. J Educ Meas 8:15–19
  24. Jaradat D, Tollefson N (1988) The impact of alternative scoring procedures for multiple-choice items on test reliability, validity and grading. Educ Psychol Meas 48:627–635
  25. Kahneman D, Tversky A (1979) Prospect theory: an analysis of decision under risk. Econometrica 47:263–291
  26. Lunn DJ, Thomas A, Best N, Spiegelhalter D (2000) WinBUGS – a Bayesian modeling framework: concepts, structure, and extensibility. Stat Comput 10:325–337
  27. Masters GN (1982) A Rasch model for partial credit scoring. Psychometrika 47:149–174
  28. Michael JC (1968) The reliability of a multiple choice examination under various test-taking instructions. J Educ Meas 5:307–314
  29. Muraki E (1992) A generalized partial credit model: application of an EM algorithm. Appl Psychol Meas 16:159–176
  30. Pugh RC, Brunza JJ (1975) Effects of a confidence weighted scoring system on measures of test reliability and validity. Educ Psychol Meas 35:73–78
  31. Rippey RM (1970) A comparison of five different scoring functions for confidence tests. J Educ Meas 7:165–170
  32. Ruch GM, Stoddard GD (1925) Comparative reliabilities of objective examinations. J Educ Psychol 16:89–103
  33. Samejima F (1969) Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph, No. 17
  34. Samejima F (1972) A general model for free-response data. Psychometrika Monograph, No. 18
  35. Samejima F (1979) A new family of models for the multiple choice item (Research Report No. 79-4). University of Tennessee, Department of Psychology, Knoxville
  36. San Martin E, del Pino G, de Boeck P (2006) IRT models for ability-based guessing. Appl Psychol Meas 30:183–203
  37. Smith RM (1987) Assessing partial knowledge in vocabulary. J Educ Meas 24:217–231
  38. Stanley JC, Wang MD (1970) Weighting test items and test item options, an overview of the analytical and empirical literature. Educ Psychol Meas 30:21–35
  39. Swineford F (1938) Measurement of a personality trait. J Educ Psychol 29:295–300
  40. Swineford F (1941) Analysis of a personality trait. J Educ Psychol 32:348–444
  41. Sykes RC, Hou L (2003) Weighting constructed-response items in IRT-based exams. Appl Meas Educ 16:257–275
  42. Thissen D, Steinberg L (1984) A response model for multiple choice items. Psychometrika 49:501–519
  43. Thurstone LL (1919) A method for scoring tests. Psychol Bull 16:235–240
  44. Tversky A, Kahneman D (1992) Advances in prospect theory: cumulative representation of uncertainty. J Risk Uncertainty 5:297–323
  45. Wang MW, Stanley JC (1970) Differential weighting: a review of methods and empirical studies. Rev Educ Res 40:663–705
  46. Yaniv I, Schul Y (1997) Elimination and inclusion procedures in judgment. J Behav Decis Mak 10:211–220
  47. Yaniv I, Schul Y (2000) Acceptance and elimination procedure in choice: noncomplementarity and the role of implied status quo. Organ Behav Hum Decis Process 82:293–313

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Yuanchao (Emily) Bo (1)
  • Charles Lewis (1)
  • David V. Budescu (1)

  1. Department of Psychology, Fordham University, Bronx, USA