Abstract
To provide further evidence for validating Confidence Scoring, a new scoring approach for speaking tests, this chapter reports a qualitative investigation of raters’ perceptions of using it. Confidence Scoring and Traditional Scoring were compared using interview data from five raters to provide a fuller understanding of the similarities and differences between the two approaches. The similarities perceived by the interviewees indicate that Confidence Scoring is based on and develops from Traditional Scoring. The differences they identified reveal that Confidence Scoring subsumes Traditional Scoring but provides a more flexible way of acknowledging rater confidence in measuring candidate performance, as well as a more defensible way of employing confidence scores to quantify candidate performance.
Copyright information
© 2014 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Jin, T. (2014). Putting Rater Confidence in Its Place: A Qualitative Investigation of Raters’ Perceptions on Using Confidence Scoring in Speaking Tests. In: Coniam, D. (ed.) English Language Education and Assessment. Springer, Singapore. https://doi.org/10.1007/978-981-287-071-1_12
DOI: https://doi.org/10.1007/978-981-287-071-1_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-070-4
Online ISBN: 978-981-287-071-1
eBook Packages: Humanities, Social Sciences and Law; Education (R0)