Abstract
In the UK the principal concern is to maintain standards that already exist rather than to set new ones. Keeping standards ‘constant’ is essentially a process of comparison rather than measurement. This chapter presents four examples showing how Thurstone’s method of comparative judgement can be used to maintain standards, especially in the more ‘difficult’ cases involving extended writing, performances, or other complex activities. In particular, the chapter describes how analysing the residuals from fitting Rasch parameters to the data can be used to monitor the quality of the equating procedure.
Notes
1. The weights used are the variances of each judgement, p(1 − p).
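As an illustration of the weighting in Note 1, the sketch below assumes the usual Rasch form of the comparative judgement model, in which the probability that script A is judged better than script B depends only on the difference between their estimated parameters. It computes each judgement's residual and an information-weighted mean square in which every squared residual is weighted by the judgement variance p(1 − p). The function names and the data layout are hypothetical and are not taken from the chapter.

```python
import math


def win_probability(a, b):
    """Modelled probability that a script with parameter a beats one with parameter b."""
    return 1.0 / (1.0 + math.exp(-(a - b)))


def weighted_mean_square(judgements, quality):
    """Information-weighted mean square of the residuals.

    judgements: iterable of (first_id, second_id, outcome), where outcome is 1
                if the first script was judged better and 0 otherwise.
    quality:    dict mapping script id to its estimated parameter (assumed to
                have been fitted already; estimation is not shown here).
    """
    squared_residuals = 0.0
    total_variance = 0.0
    for first, second, outcome in judgements:
        p = win_probability(quality[first], quality[second])
        squared_residuals += (outcome - p) ** 2  # observed minus expected, squared
        total_variance += p * (1.0 - p)          # variance of this judgement
    return squared_residuals / total_variance


if __name__ == "__main__":
    # Hypothetical parameters and judgements, for illustration only.
    quality = {"A": 1.0, "B": 0.0, "C": -0.5}
    judgements = [("A", "B", 1), ("B", "C", 1), ("A", "C", 0)]
    print(weighted_mean_square(judgements, quality))
```

Under these assumptions, values near 1 suggest that the judgements are about as consistent as the model expects, while much larger values point to judges or scripts that merit a closer look.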
References
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
D’Arcy, J. (Ed.). (1997). Comparability studies between modular and non-modular syllabuses in GCE advanced level biology, English literature and mathematics in the 1996 summer examinations. Standing Committee on Research on behalf of the Joint Forum for the GCSE and GCE.
Kahneman, D. (2011). Thinking, fast and slow. New York/London: Allen Lane.
Kahneman, D., & Klein, G. (2009). Conditions for intuitive expertise: A failure to disagree. American Psychologist, 64, 515–526.
Klein, G. (2008). Naturalistic decision making. Human Factors, 50(3), 456–460.
Laming, D. (2004). Human judgment: The eye of the beholder. London: Thomson.
Linacre, J. M. (2010). A user’s guide to Facets. 3.67.1. Chicago: MESA Press.
Ofqual. (2014a). Setting standards for new GCSEs in 2017: Press release. https://www.gov.uk/government/news/setting-standards-for-new-gcses-in-2017. Accessed 10 Oct 2016.
Ofqual. (2014b). Guidance: Grade descriptors for GCSEs graded 9 to 1. https://www.gov.uk/government/publications/grade-descriptors-for-gcses-graded-9-to-1. Accessed 10 Oct 2016.
Pollitt, A. (2004). Let’s stop marking exams. Paper presented at the annual conference of the International Association for Educational Assessment, Philadelphia, June 2004.
Pollitt, A. (2012a). Comparative Judgement for assessment. International Journal of Technology and Design Education, 22(2), 157–170. doi:10.1007/s10798-011-9189-x.
Pollitt, A. (2012b). The method of adaptive comparative judgment. Assessment in Education: Principles, Policy and Practice. doi:10.1080/0969594X.2012.665354.
Pollitt, A., & Murray, N. L. (1993). What raters really pay attention to. Language Testing Research Colloquium, Cambridge. Reprinted in M. Milanovic & N. Saville (Eds.), (1996), Studies in language testing 3: Performance testing, cognition and assessment. Cambridge: Cambridge University Press.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286. Chapter 3 in L. L. Thurstone (1959), The measurement of values. Chicago: University of Chicago Press.
Wordsworth, C. (1877). Scholae academicae. London: Frank Cass.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Pollitt, A. (2017). Using Professional Judgement To Equate Exam Standards. In: Blömeke, S., Gustafsson, JE. (eds) Standard Setting in Education. Methodology of Educational Measurement and Assessment. Springer, Cham. https://doi.org/10.1007/978-3-319-50856-6_16
DOI: https://doi.org/10.1007/978-3-319-50856-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50855-9
Online ISBN: 978-3-319-50856-6
eBook Packages: Education, Education (R0)