Abstract
Background and aims: Despite the importance of developing and assessing oral interactive ability, few studies have investigated paired oral assessment for Japanese learners of English. This study refines Koizumi et al. (in press), expands the number of paired oral tasks calibrated on a logit scale, and examines aspects of validity related mainly to paired oral tasks and raters.
Methods: A total of 190 Japanese students from three universities completed 11 paired oral tasks. Their responses were recorded and evaluated by three raters using a holistic scale. A multifaceted Rasch measurement program, Facets (Linacre 2014), was used: the rating scale model was applied to examine test-taker ability, task difficulty, rater severity, and rating scale functioning. Structural equation modeling and generalizability theory were also employed.
Results and discussion: Results showed a unitary factor structure in the test, with some error correlations between tasks and between raters. All tasks and raters fit the Rasch model, and the rating scale functioned properly. The tasks covered a relatively wide range of difficulty levels, although gaps remained at the upper and lower ends and at some intermediate levels. Results also suggested that large percentages of score variance were explained by persons (test takers), by person-by-task and person-by-rater interactions, and by residuals, and that four tasks with two raters, or three tasks with three raters, are needed to achieve a sufficient reliability of φ = 0.70. Overall, the study provides validity evidence for the interpretation of scores from the paired oral tasks developed.
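The reliability claim above comes from a generalizability-theory decision (D-) study: once variance components for a persons × tasks × raters (p × t × r) design are estimated, the index of dependability φ can be projected for any number of tasks and raters. The following sketch illustrates that calculation; the variance components are hypothetical values chosen for illustration only (the chapter's actual estimates are not reproduced here), though they mirror the reported pattern in which persons, person-by-task and person-by-rater interactions, and residuals dominate.

```python
# D-study sketch for a fully crossed p x t x r random-effects design.
# vc holds HYPOTHETICAL variance components, not the study's estimates.

def phi(vc, n_t, n_r):
    """Index of dependability (phi) for n_t tasks and n_r raters."""
    abs_error = (
        (vc["t"] + vc["pt"]) / n_t          # task main effect + person-by-task
        + (vc["r"] + vc["pr"]) / n_r        # rater main effect + person-by-rater
        + (vc["tr"] + vc["ptr,e"]) / (n_t * n_r)  # two-way + residual
    )
    return vc["p"] / (vc["p"] + abs_error)

vc = {"p": 0.56, "t": 0.05, "r": 0.03,
      "pt": 0.30, "pr": 0.20, "tr": 0.02, "ptr,e": 0.40}

for n_t, n_r in [(1, 1), (4, 2), (3, 3)]:
    print(f"{n_t} tasks x {n_r} raters: phi = {phi(vc, n_t, n_r):.2f}")
```

With these illustrative components, a single task rated once yields a low φ, while adding tasks and raters raises dependability toward the 0.70 threshold; increasing the number of tasks pays off most when the person-by-task component is large, which is the pattern the abstract describes.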
References
Aryadoust, V. (2016). Gender and academic major bias in peer assessment of oral presentations. Language Assessment Quarterly, 13, 1–24. doi:10.1080/15434303.2015.1133626.
Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Brennan, R. L. (2001). Generalizability theory. New York: Springer.
Butler, Y. G., & Zeng, W. (2014). Young foreign language learners’ interactions during task-based paired assessment. Language Assessment Quarterly, 11, 45–75. doi:10.1080/15434303.2013.869814.
Byrne, B. M. (2012). Structural equation modeling with Mplus: Basic concepts, applications, and programming. New York: Routledge.
Cambridge ESOL Examinations (2010). Speaking test preparation pack for Key English Test. Cambridge, U.K.: Author.
Center for Advanced Studies in Measurement and Assessment (University of Iowa, College of Education). (2013). GENOVA suite programs. Retrieved from http://www.education.uiowa.edu/centers/casma/computer-programs#8f748e48-f88c-6551-b2b8-ff00000648cd.
Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (Eds.). (2008). Building a validity argument for the Test of English as a Foreign Language™. New York, NY: Routledge.
Davis, L. (2009). The influence of interlocutor proficiency in a paired oral assessment. Language Testing, 26, 367–396. doi:10.1177/0265532209104667.
Davis, L. (2016). The influence of training and experience on rater performance in scoring spoken language. Language Testing, 33, 117–135. doi:10.1177/0265532215582282.
Edwards, L. (2008). Common European Framework assessment tests. London, U.K.: Mary Glasgow Magazines (Scholastic).
Galaczi, E. D. (2008). Peer-peer interaction in a speaking test: The case of the First Certificate in English examination. Language Assessment Quarterly, 5, 89–119. doi:10.1080/15434300801934702.
Galaczi, E. D. (2014). Interactional competence across proficiency levels: How do learners manage interaction in paired speaking tests? Applied Linguistics, 35, 553–574. doi:10.1093/applin/amt017.
Galaczi, E., & ffrench, A. (2011). Context validity. In L. Taylor (Ed.), Examining speaking: Research and practice in assessing second language speaking (pp. 112–170). Cambridge, UK: Cambridge University Press.
Kley, K. (2015). Interactional competence in paired speaking tests: Role of paired task and test-taker speaking ability in co-constructed discourse. Unpublished Ph.D. dissertation, University of Iowa, U.S. Retrieved from http://ir.uiowa.edu/etd/1663/.
Kline, R. B. (2010). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford Press.
Koizumi, R., In’nami, Y., & Fukazawa, M. (in press). Development of a paired oral test for Japanese university students. British Council New Directions in Language Assessment: JASELE Journal Special Edition.
Lin, C.-K. (2014). Treating either ratings or raters as a random facet in performance-based language assessments: Does it matter? CaMLA Working Papers 2014-01. Cambridge Michigan Language Assessments. Retrieved from http://www.cambridgemichigan.org/sites/default/files/resources/workingpapers/CWP-2014-01.pdf.
Linacre, J. M. (2013). A user’s guide to FACETS: Rasch-model computer programs (Program manual 3.71.0). Retrieved from http://www.winsteps.com/a/facets-manual.pdf.
Linacre, J. M. (2014). Facets: Rasch-measurement computer program (Version 3.71.4) [Computer software]. Chicago: MESA Press.
McNamara, T., & Knoch, U. (2012). The Rasch wars: The emergence of Rasch measurement in language testing. Language Testing, 29, 555–576. doi:10.1177/0265532211430367.
Messick, S. (1996). Validity and washback in language testing. Language Testing, 13, 241–256. doi:10.1177/026553229601300302.
Muthén, L., & Muthén, B. (2014). Mplus (Version 7.2) [Computer software]. Los Angeles, CA: Muthén & Muthén.
Negishi, J. (2015). Effects of test types and interlocutors’ proficiency on oral performance assessment. Annual Review of English Language Education in Japan, 26, 333–348.
Ockey, G. J., & Choi, I. (2015). Structural equation modeling reporting practices for language assessment. Language Assessment Quarterly, 12, 305–319. doi:10.1080/15434303.2015.1050101.
Taylor, L., & Wigglesworth, G. (2009). Are two heads better than one? Pair work in L2 assessment contexts. Language Testing, 26, 325–339. doi:10.1177/0265532209104665.
Van Moere, A. (2006). Validity evidence in a university group oral test. Language Testing, 23, 411–440. doi:10.1191/0265532206lt336oa.
Wang, J., & Wang, X. (2012). Structural equation modeling: Applications using Mplus. West Sussex, UK: Wiley.
Acknowledgement
This work was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI, Grant-in-Aid for Scientific Research (C), Grant Number 26370737.
Copyright information
© 2016 Springer Science+Business Media Singapore
Cite this paper
Koizumi, R., In’nami, Y., Fukazawa, M. (2016). Multifaceted Rasch Analysis of Paired Oral Tasks for Japanese Learners of English. In: Zhang, Q. (eds) Pacific Rim Objective Measurement Symposium (PROMS) 2015 Conference Proceedings. Springer, Singapore. https://doi.org/10.1007/978-981-10-1687-5_6
DOI: https://doi.org/10.1007/978-981-10-1687-5_6
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-1686-8
Online ISBN: 978-981-10-1687-5
eBook Packages: Behavioral Science and Psychology (R0)