Valid for the Elites? The Trade-Off Between Test Fairness and Test Validity

  • Chapter
Revisiting the Assessment of Second Language Abilities: From Theory to Practice

Part of the book series: Second Language Learning and Teaching (SLLT)

Abstract

The relationship between validity and fairness has been heatedly debated in the literature (Kunnan, 2010). The orthodoxy is that test fairness should be subsumed under validity; that is, a valid test ensures fairness. Tracking the retrofits of a high-stakes national language test in Iran, the Specialized English Test (SET), which is used to determine admission to tertiary English language programs, and collecting data on test takers’ language learning experiences, this article argues against the established view that more valid tests necessarily promote fairness. As an achievement measure based on the secondary school English curriculum, the previous version of the SET was widely criticized for its construct underrepresentation (Farhady & Hedayati, 2009) and fizziness. In its current version, the SET is more construct representative, for it goes beyond the high school curriculum and covers more areas of communicative competence. Data collected from 173 undergraduate students of English translation and literature in three national universities across the country revealed that an overwhelming majority of students came from families with high socio-economic status, with poorer students represented only in the low-tier university student population. This finding indicates that the improvement in validity has come at a cost to fairness and social mobility, thereby reproducing the existing social order by denying underprivileged applicants access to quality tertiary-level language programs. The paper further discusses issues of test validity and fairness and calls for a broader understanding of test consequences within a larger sociocultural perspective.

References

  • Ahmadi, A., & Darabi Bazvand, A. (2016). Gender differential item functioning on a national field-specific test: The case of PhD entrance exam of TEFL in Iran. Iranian Journal of Language Teaching Research, 4(1), 63–82.

  • Angelelli, C. V., & Jacobson, H. E. (2009). Introduction: Testing and assessment in translation and interpreting studies: A call for dialogue between research and practice. In C. V. Angelelli & H. E. Jacobson (Eds.), Testing and assessment in translation and interpreting studies (pp. 1–9). Amsterdam: John Benjamins Publishing.

  • Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.

  • Bachman, L. F. (2000). What, if any, are the limits of our responsibility for fairness in language testing? In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.

  • Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests. Oxford: Oxford University Press.

  • Bachman, L. F., & Palmer, A. (2010). Language assessment in practice: Developing language assessments and justifying their use in the real world. Oxford: Oxford University Press.

  • Borsboom, D., & Mellenbergh, G. J. (2007). Test validity in cognitive assessment. In J. Leighton & M. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 85–118). Cambridge: Cambridge University Press.

  • Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061–1071. https://doi.org/10.1037/0033-295X.111.4.1061.

  • Byczkiewicz, V. (2004). Filmic portrayals of cheating or fraud in examinations and competitions. Language Assessment Quarterly, 1(2&3), 193–204. https://doi.org/10.1080/15434303.2004.9671785.

  • Carlsen, C. (2009). Crossing the bridge from the other side: The impact of society on testing. In L. Taylor & C. J. Weir (Eds.), Language testing matters, Studies in language testing 31 (pp. 344–356). Cambridge: Cambridge University Press.

  • Chalhoub-Deville, M. (1997). Theoretical models, assessment frameworks and test construction. Language Testing, 14(1), 3–22. https://doi.org/10.1177/026553229701400102.

  • Corson, D. (1997). Critical realism: An emancipatory philosophy for applied linguistics? Applied Linguistics, 18(2), 166–188. https://doi.org/10.1093/applin/18.2.166.

  • Davies, A. (1997). Introduction: The limits of ethics in language testing. Language Testing, 14(3), 235–241. https://doi.org/10.1177/026553229701400301.

  • Davies, A. (2010). Test fairness: A response. Language Testing, 27(2), 171–176. https://doi.org/10.1177/0265532209349466.

  • De Beauvoir, S. (2014). The second sex. New York: Random House.

  • Educational Testing Service. (2002). ETS standards for quality and fairness. Princeton, NJ: Author.

  • Farhady, H., & Hedayati, H. (2009). Language assessment policy in Iran. Annual Review of Applied Linguistics, 29, 132–141. https://doi.org/10.1017/S0267190509090114.

  • Fulcher, G. (2009). Test use and political philosophy. Annual Review of Applied Linguistics, 29, 3–20. https://doi.org/10.1017/S0267190509090023.

  • Fulcher, G. (2010). Practical language testing. London: Hodder Education.

  • Fulcher, G. (2014). Language testing and philosophy. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1–17). Boston: Wiley.

  • Fulcher, G. (2015). Re-examining language testing: A philosophical and social inquiry. London: Routledge.

  • Gao, L. (2011). Impacts of cultural capital on student college choice in China. Maryland: Lexington Books.

  • Gipps, C. V. (1994). Beyond testing: Towards a theory of educational assessment. London: The Falmer Press.

  • Gipps, C., & Stobart, G. (2009). Fairness in assessment. In C. Wyatt-Smith & J. Cumming (Eds.), Educational assessment in the 21st century (pp. 105–118). Netherlands: Springer.

  • Goldstein, H. (2012). Francis Galton, measurement, psychometrics and social progress. Assessment in Education: Principles, Policy & Practice, 19(2), 147–158. https://doi.org/10.1080/0969594X.2011.614220.

  • Green, A. (2014). Exploring language assessment and testing: Language in action. New York: Routledge.

  • Hamp-Lyons, L. (1998). Ethical test preparation practice: The case of the TOEFL. TESOL Quarterly, 32(2), 329–337. https://doi.org/10.2307/3587587.

  • Hamp-Lyons, L. (2000). Fairness in language testing. In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.

  • Hamp-Lyons, L. (2016). Washback, impact and validity: Ethical concerns. Language Testing, 14(3), 295–303.

  • Hidri, S. (2014). Developing and evaluating a dynamic assessment of listening comprehension in an EFL context. Language Testing in Asia, 4(4), 2–19. https://doi.org/10.1186/2229-0443-4-4.

  • House, J. (2014). Translation quality assessment: Past and present. New York: Routledge.

  • Kane, M. (2010). Validity and fairness. Language Testing, 27(2), 177–182. https://doi.org/10.1177/0265532209349467.

  • Karami, H. (2013). The quest for fairness in language testing. Educational Research and Evaluation, 19(2–3), 158–169. https://doi.org/10.1080/13803611.2013.767618.

  • Klitgaard, R. (1985). Choosing elites: Selecting the “best and the brightest” at top universities and elsewhere. New York: Basic.

  • Kunnan, A. J. (2000). Fairness and justice for all. In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.

  • Kunnan, A. J. (2010). Test fairness and Toulmin’s argument structure. Language Testing, 27(2), 183–189. https://doi.org/10.1177/0265532209349468.

  • Kunnan, A. J. (2014). Fairness and justice in language assessment. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1–17). Boston: Wiley.

  • Lantolf, J. P., & Poehner, M. E. (2013). The unfairness of equal treatment: Objectivity in L2 testing and dynamic assessment. Educational Research and Evaluation, 19(2–3), 141–157. https://doi.org/10.1080/13803611.2013.767616.

  • Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. New York: Routledge.

  • Larson-Hall, J. (2016). Our statistical intuitions may be misleading us: Why we need robust statistics. Language Teaching, 45, 460–474. https://doi.org/10.1017/S0261444811000127.

  • McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Malden, MA: Blackwell.

  • Mehrens, W. A., & Kaminsky, J. (1989). Methods for improving standardized test scores: Fruitful, fruitless, or fraudulent? Educational Measurement: Issues and Practice, 8(1), 14–22. https://doi.org/10.1111/j.1745-3992.1989.tb00304.x.

  • Messick, S. (1996). Validity and washback in language testing. Language Testing, 12(3), 241–256. https://doi.org/10.1002/j.2333-8504.1996.tb01695.x.

  • Moses, M. S., & Nanna, M. J. (2007). The testing culture and the persistence of high stakes testing reforms. Education and Culture, 23(1), 55–72.

  • Newton, P., & Shaw, S. (2014). Validity in educational and psychological assessment. Los Angeles, CA: Sage.

  • Pennycook, A. (2001). Critical applied linguistics: A critical introduction. Mahwah, NJ: Lawrence Erlbaum.

  • O’Sullivan, B., & Weir, C. J. (2011). Test development and validation. In B. O’Sullivan (Ed.), Language testing theories and practices (pp. 13–32). Basingstoke: Palgrave Macmillan.

  • Popham, W. J. (1991). Appropriateness of teachers’ test-preparation practices. Educational Measurement: Issues and Practice, 10(4), 12–15. https://doi.org/10.1111/j.1745-3992.1991.tb00211.x.

  • Pan, Y., & Roever, C. (2016). Consequences of test use: A case study of employers’ voice on the social impact of English certification exit requirements in Taiwan. Language Testing in Asia, 6(1), 1–21.

  • Potter, G., & Lopez, J. (2001). After postmodernism: The millennium. In J. Lopez & G. Potter (Eds.), After postmodernism: An introduction to critical realism (pp. 1–18). London: The Athlone Press.

  • Razavipour, K. (2010). National matriculation test for English major students: Its impact and some validity evidence. Unpublished doctoral dissertation, Shiraz University, Shiraz, Iran.

  • Scott, E. D. (2016). Assessment as a dimension of globalization. In S. Scott, D. E. Scott, & C. F. Webber (Eds.), Assessment in education: Implications for leadership (pp. 17–52). New York: Springer.

  • Shohamy, E. (1998). Critical language testing and beyond. Studies in Educational Evaluation, 24(4), 331–345. https://doi.org/10.1016/S0191-491X(98)00020-0.

  • Shohamy, E. (2000). Fairness in language testing. In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.

  • Shohamy, E. (2001). Democratic assessment as an alternative. Language Testing, 18(4), 373–391. https://doi.org/10.1177/026553220101800404.

  • Song, X. (2016). Fairness in educational assessment in China: Historical practices and contemporary challenges. In S. Scott, D. E. Scott, & C. F. Webber (Eds.), Assessment in education: Implications for leadership (pp. 67–90). New York: Springer.

  • Stansfield, C. W., & Winke, P. M. (2008). Testing aptitude for second language learning. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopaedia of language and education, Language testing and assessment (Vol. 7, 2nd ed., pp. 2226–2239). New York: Springer.

  • Stoneberg, B. D. (2004). A study of gender-based and ethnic-based differential item functioning (DIF) in the spring 2003 Idaho Standards Achievement Tests. Applying the Simultaneous Bias Test (SIBTEST) and the Mantel-Haenszel Chi Square Test. Paper for EDMS 889 Measurement-Statistics Practicum, University of Maryland, College Park. Retrieved from http://files.eric.ed.gov/fulltext/ED483777.pdf

  • Teo, A. (2013). Promoting EFL students’ inferential reading skills through computerized dynamic assessment. Language Learning & Technology, 16(3), 10–20.

  • Walters, F. S. (2012). Fairness. In G. Fulcher & F. Davidson (Eds.), The Routledge handbook of language testing (pp. 469–478). London: Routledge.

  • Weir, C. J. (2005). Language testing and validation. Hampshire: Palgrave Macmillan.

  • Xi, X. (2010). How do we go about investigating test fairness? Language Testing, 27(2), 147–170. https://doi.org/10.1177/0265532209349465.

  • Zwick, R. (2012). Admissions testing in higher education. In C. Secolsky & D. B. Denison (Eds.), Handbook of measurement, assessment, and evaluation in higher education (pp. 382–404). New York: Routledge.

Author information

Corresponding author

Correspondence to Kioumars Razavipour.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Razavipour, K. (2018). Valid for the Elites? The Trade-Off Between Test Fairness and Test Validity. In: Hidri, S. (eds) Revisiting the Assessment of Second Language Abilities: From Theory to Practice. Second Language Learning and Teaching. Springer, Cham. https://doi.org/10.1007/978-3-319-62884-4_18

  • DOI: https://doi.org/10.1007/978-3-319-62884-4_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62883-7

  • Online ISBN: 978-3-319-62884-4

  • eBook Packages: Education, Education (R0)
