The Sixth Validation Study: Assessing the Ease of Use in the Environment and Markers’ Acceptance of Onscreen Marking in Hong Kong in Three Subject Areas: A Rasch Measurement Perspective

  • David Coniam
  • Peter Falvey
  • Zi Yan


During the series of validation studies investigating the implementation of onscreen marking (OSM) in Hong Kong’s public examinations, an issue arose: some markers were not as positive as might have been expected, given the English Language markers’ reactions to the enhanced support provided by the system. Concern was expressed about the feedback and support provided to markers on the accuracy of their marking, and this was considered an area worth investigating with a view to enhancing the amount and type of feedback provided. This chapter therefore extends the investigation into OSM (see accounts of previous studies in this volume) into two areas: ease of use in the environment, and markers’ acceptance of OSM in the Hong Kong public examination context. In contrast to previous studies in this volume, which each focused on a single subject area, this study took a heterogeneous approach: the sample contained scripts from three subjects (English Language, Chinese Language, and Liberal Studies), comprising both essays and short-answer questions, written in either English or Chinese. Two scales assessing ease of use and markers’ acceptance of OSM were investigated from a Rasch measurement perspective (a more sophisticated mode of measurement than classical test theory, as described in Chap. 3 and below); both scales showed good psychometric properties. The findings revealed that markers generally perceived a high level of ease of use in the environment and that overall acceptance of OSM was positive. Differences in person measures across language, question type, and subject were compared, and implications of the two scales for future validation studies are briefly discussed.
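The Rasch perspective mentioned above places persons (here, markers) and items (scale statements) on a common logit scale. As an illustration only, the dichotomous Rasch model can be sketched as follows; the function and parameter names are generic, and the chapter's actual scales would use polytomous rating-scale extensions of this model:

```python
import math

def rasch_probability(person_measure: float, item_difficulty: float) -> float:
    """Probability of a positive response (e.g. endorsing a scale item)
    under the dichotomous Rasch model; both parameters are in logits."""
    logit = person_measure - item_difficulty
    return 1.0 / (1.0 + math.exp(-logit))

# A person whose measure equals an item's difficulty has a 50% chance
# of endorsing that item; higher measures raise the probability.
print(rasch_probability(0.0, 0.0))  # 0.5
print(rasch_probability(1.5, 0.0))  # > 0.5
```

Because the probability depends only on the difference between person and item parameters, person measures estimated under this model can be compared across groups (here, across language, question type, and subject), which is what makes the comparisons reported in the chapter meaningful.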


Copyright information

© Springer Science+Business Media Singapore 2016

Authors and Affiliations

  1. Department of Curriculum and Instruction, The Education University of Hong Kong, Tai Po, Hong Kong