Predicting Human Computation Game Scores with Player Rating Systems
Human computation games aim to apply human skill toward real-world problems through gameplay. Such games may suffer from poor retention, potentially because building them around pre-existing problems constrains game design. Previous work has proposed using player rating systems and matchmaking to balance the difficulty of human computation games, and has explored the use of rating systems to predict the outcomes of player attempts at levels. However, these predictions were win/loss, which required setting a score threshold to determine whether a player won or lost. This may be undesirable in human computation games, where the range of achievable scores may be unknown. In this work, we examined the use of rating systems for predicting the scores, rather than the win/loss outcomes, of player attempts at levels. We found that, except in cases with a narrow range of scores and little prior information on player performance, Glicko-2 compares favorably with alternative methods.
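To make the distinction between win/loss and score prediction concrete, the following is a minimal sketch using the standard Elo expected-outcome formula. It assumes that a level is treated as the player's opponent and that each attempt's score has been rescaled to [0, 1], so the rescaled score can be used directly as the observed outcome instead of a thresholded win or loss. The function names, the K-factor of 32, and the starting ratings of 1500 are illustrative assumptions, not details taken from this work, which evaluates Glicko-2 among other methods.

```python
def expected_score(player_rating: float, level_rating: float) -> float:
    """Standard Elo expected outcome in [0, 1] for the player against the level."""
    return 1.0 / (1.0 + 10 ** ((level_rating - player_rating) / 400.0))


def update(player_rating: float, level_rating: float,
           observed: float, k: float = 32.0) -> tuple[float, float]:
    """Return updated (player, level) ratings after one attempt.

    `observed` is the attempt's score rescaled to [0, 1] (an assumption of
    this sketch), rather than a thresholded 1 = win / 0 = loss.
    """
    predicted = expected_score(player_rating, level_rating)
    delta = k * (observed - predicted)
    # The player gains what the level loses, as in a two-player Elo match.
    return player_rating + delta, level_rating - delta


# Hypothetical starting ratings for a new player and a new level.
player, level = 1500.0, 1500.0
predicted = expected_score(player, level)   # predicted rescaled score
player, level = update(player, level, observed=0.8)
```

Under these assumptions the expected outcome doubles as the predicted score, so no win threshold has to be chosen in advance.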
Keywords: Human computation games, Player rating systems, Prediction, Elo, Glicko-2
This work was supported by a Northeastern University TIER 1 grant. This material is based upon work supported by the National Science Foundation under Grant No. 1652537. We would like to thank the University of Washington’s Center for Game Science for initial Paradox development.