Abstract
The project “Modeling competencies with multidimensional item-response-theory models” examined different psychometric models for student performance in English as a foreign language. On the basis of re-analyses of data from completed large-scale assessments, a new test of reading and listening comprehension was constructed. The items within this test use the same text material for both reading and listening tasks, thus allowing a closer examination of the relations between the abilities required for comprehending written and spoken texts. Furthermore, item characteristics (e.g., cognitive demands and response format) were systematically varied, allowing us to disentangle the effects of these characteristics on item difficulty and dimensional structure. This chapter presents results on the properties of the newly developed test: both reading and listening comprehension can be measured reliably (rel = .91 for reading and .86 for listening). Abilities in the two sub-domains prove to be highly correlated yet empirically distinguishable, with a latent correlation of .84. Although the listening items are more difficult in terms of the proportion of correct answers, the difficulties of the same items in the reading and listening versions are highly correlated (r = .84). Implications of the results for measuring language competencies in educational contexts are discussed.
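The reported figures can be related through the classical correction for attenuation: a latent correlation between two abilities implies a smaller correlation between the corresponding observed scores, shrunk by the square root of the product of the two reliabilities. The short sketch below is only a back-of-the-envelope illustration using the numbers reported in the abstract (rel = .91, rel = .86, latent r = .84); the chapter itself estimates the latent correlation directly within a multidimensional IRT model rather than via this formula.

```python
import math

def attenuate(r_latent, rel_x, rel_y):
    """Observed-score correlation implied by a latent correlation
    and the reliabilities of the two measures."""
    return r_latent * math.sqrt(rel_x * rel_y)

def disattenuate(r_obs, rel_x, rel_y):
    """Inverse operation: correct an observed correlation for
    unreliability in both measures."""
    return r_obs / math.sqrt(rel_x * rel_y)

# Values reported in the chapter: rel = .91 (reading), .86 (listening),
# latent correlation = .84 between the two abilities.
r_obs = attenuate(0.84, 0.91, 0.86)
print(round(r_obs, 2))  # → 0.74: manifest correlation implied by the latent one
```

Reversing the computation with `disattenuate(0.74, 0.91, 0.86)` recovers (up to rounding) the latent correlation of .84, which is one intuition for why highly correlated but distinguishable dimensions can still show a visibly lower raw-score correlation.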
Acknowledgments
The preparation of this chapter was supported by grant HA5050/2-3 from the German Research Foundation (DFG) in the Priority Program “Competence Models for Assessing Individual Learning Outcomes and Evaluating Educational Processes” (SPP 1293).
Copyright information
© 2017 Springer International Publishing AG
Cite this chapter
Hartig, J., & Harsch, C. (2017). Multidimensional structures of competencies: Focusing on text comprehension in English as a foreign language. In D. Leutner, J. Fleischer, J. Grünkorn, & E. Klieme (Eds.), Competence assessment in education. Methodology of Educational Measurement and Assessment. Cham: Springer. https://doi.org/10.1007/978-3-319-50030-0_21
Print ISBN: 978-3-319-50028-7
Online ISBN: 978-3-319-50030-0