High Speed High Stakes Scoring Rule

Assessing the Performance of a New Scoring Rule for Digital Assessment
  • Sharon Klinkenberg
Part of the Communications in Computer and Information Science book series (CCIS, volume 439)


In this paper we present the results of a three-year subsidized research project investigating the performance of a new scoring rule for digital assessment, the High Speed High Stakes (HSHS) scoring rule. The scoring rule incorporates response time and accuracy in an adaptive environment. The project aimed to assess the validity and reliability of the ability estimates generated with the new scoring rule. We also assessed whether the scoring rule was sensitive to individual differences. The results show strong validity and reliability across several studies in different domains, e.g. math, statistics, and chess. We found no differences in the performance of the HSHS scoring rule related to risk-taking behavior or performance anxiety, nor did we find any performance differences by gender.
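The abstract does not spell out the rule itself. As a hedged illustration only: a form commonly given for a high speed, high stakes rule in the speed-accuracy scoring literature is S = (2x - 1)(d - t), where x is 1 for a correct and 0 for an incorrect response, d is the item's time limit, and t is the response time. That formula, the function name `hshs_score`, and its parameters are assumptions made for this sketch; they are not taken from the abstract and may differ from what the paper evaluates.

```python
def hshs_score(correct: bool, response_time: float, deadline: float) -> float:
    """Signed remaining time, (2x - 1) * (d - t).

    Assumed form of the rule, for illustration: a fast correct answer
    earns a large positive score, a fast wrong answer an equally large
    negative score, and an answer at the deadline scores zero.
    """
    if not 0 <= response_time <= deadline:
        raise ValueError("response_time must lie in [0, deadline]")
    x = 1 if correct else 0
    return (2 * x - 1) * (deadline - response_time)


# Example with a hypothetical 20-second time limit.
fast_correct = hshs_score(True, 5, 20)    # 15 seconds left, counted as a gain
fast_wrong = hshs_score(False, 5, 20)     # 15 seconds left, counted as a loss
slow_correct = hshs_score(True, 18, 20)   # only 2 seconds left, small gain
```

Under this assumed form, a quick guess risks as much as a quick correct answer gains, which is the sense in which response time and accuracy are combined in a single score.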


Keywords: computer adaptive testing, speed-accuracy trade-off, scoring rule, digital assessment, validity, reliability, CAT, DIF





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sharon Klinkenberg (1)
  1. Faculty of Social and Behavioural Sciences, Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
