Valid for the Elites? The Trade-Off Between Test Fairness and Test Validity

  • Kioumars Razavipour
Chapter
Part of the Second Language Learning and Teaching book series (SLLT)

Abstract

The relationship between validity and fairness has been hotly debated in the literature (Kunnan, 2010). The orthodoxy is that test fairness should be subsumed under validity; that is, a valid test ensures fairness. Tracking the retrofits of a high-stakes national language test in Iran, the Specialized English Test (SET), which is used to determine admission to tertiary English language programs, and collecting data on test takers’ language learning experiences, this article argues against the established view that more valid tests necessarily promote fairness. As an achievement measure based on the secondary school English curriculum, the previous version of the SET was widely criticized for its construct underrepresentation (Farhady & Hedayati, 2009) and fizziness. In its current version, the SET is more construct representative, for it goes beyond the high school curriculum and covers more areas of communicative competence. Data collected from 173 undergraduate students of English translation and literature in three national universities across the country revealed that an overwhelming majority of students come from families with high socio-economic status, with poorer students represented only in the student populations of low-tier universities. This finding indicates that the improvement in validity has come at a cost to fairness and social mobility, reproducing the existing social order by denying underprivileged applicants access to quality tertiary language programs. The paper further discusses issues of test validity and fairness and calls for a broader understanding of test consequences within a larger sociocultural perspective.

Keywords

Fairness; Validity; Ethics; Construct representation; Specialized English Test

References

  1. Ahmadi, A., & Darabi Bazvand, A. (2016). Gender differential item functioning on a national field-specific test: The case of PhD entrance exam of TEFL in Iran. Iranian Journal of Language Teaching Research, 4(1), 63–82.
  2. Angelelli, C. V., & Jacobson, H. E. (2009). Introduction: Testing and assessment in translation and interpreting studies: A call for dialogue between research and practice. In C. V. Angelelli & H. E. Jacobson (Eds.), Testing and assessment in translation and interpreting studies (pp. 1–9). Amsterdam: John Benjamins Publishing.
  3. Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
  4. Bachman, L. F. (2000). What, if any, are the limits of our responsibility for fairness in language testing? In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.
  5. Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice: Designing and developing useful language tests. Oxford: Oxford University Press.
  6. Bachman, L. F., & Palmer, A. (2010). Language assessment in practice: Developing language assessments and justifying their use in the real world. Oxford: Oxford University Press.
  7. Borsboom, D., & Mellenbergh, G. J. (2007). Test validity in cognitive assessment. In J. Leighton & M. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 85–118). Cambridge: Cambridge University Press.
  8. Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111(4), 1061–1071. https://doi.org/10.1037/0033-295X.111.4.1061.
  9. Byczkiewicz, V. (2004). Filmic portrayals of cheating or fraud in examinations and competitions. Language Assessment Quarterly, 1(2&3), 193–204. https://doi.org/10.1080/15434303.2004.9671785.
  10. Carlsen, C. (2009). Crossing the bridge from the other side: The impact of society on testing. In L. Taylor & C. J. Weir (Eds.), Language testing matters, Studies in language testing 31 (pp. 344–356). Cambridge: Cambridge University Press.
  11. Chalhoub-Deville, M. (1997). Theoretical models, assessment frameworks and test construction. Language Testing, 14(1), 3–22. https://doi.org/10.1177/026553229701400102.
  12. Corson, D. (1997). Critical realism: An emancipatory philosophy for applied linguistics? Applied Linguistics, 18(2), 166–188. https://doi.org/10.1093/applin/18.2.166.
  13. Davies, A. (1997). Introduction: The limits of ethics in language testing. Language Testing, 14(3), 235–241. https://doi.org/10.1177/026553229701400301.
  14. Davies, A. (2010). Test fairness: A response. Language Testing, 27(2), 171–176. https://doi.org/10.1177/0265532209349466.
  15. De Beauvoir, S. (2014). The second sex. New York: Random House.
  16. Educational Testing Service. (2002). ETS standards for quality and fairness. Princeton, NJ: Author.
  17. Farhady, H., & Hedayati, H. (2009). Language assessment policy in Iran. Annual Review of Applied Linguistics, 29, 132–141. https://doi.org/10.1017/S0267190509090114.
  18. Fulcher, G. (2009). Test use and political philosophy. Annual Review of Applied Linguistics, 29, 3–20. https://doi.org/10.1017/S0267190509090023.
  19. Fulcher, G. (2010). Practical language testing. London: Hodder Education.
  20. Fulcher, G. (2014). Language testing and philosophy. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1–17). Boston: Wiley.
  21. Fulcher, G. (2015). Re-examining language testing: A philosophical and social inquiry. London: Routledge.
  22. Gao, L. (2011). Impacts of cultural capital on student college choice in China. Maryland: Lexington Books.
  23. Gipps, C. V. (1994). Beyond testing: Towards a theory of educational assessment. London: The Falmer Press.
  24. Gipps, C., & Stobart, G. (2009). Fairness in assessment. In C. Wyatt-Smith & J. Cumming (Eds.), Educational assessment in the 21st century (pp. 105–118). Netherlands: Springer.
  25. Goldstein, H. (2012). Francis Galton, measurement, psychometrics and social progress. Assessment in Education: Principles, Policy & Practice, 19(2), 147–158. https://doi.org/10.1080/0969594X.2011.614220.
  26. Green, A. (2014). Exploring language assessment and testing: Language in action. New York: Routledge.
  27. Hamp-Lyons, L. (1998). Ethical test preparation practice: The case of the TOEFL. TESOL Quarterly, 32(2), 329–337. https://doi.org/10.2307/3587587.
  28. Hamp-Lyons, L. (2000). Fairness in language testing. In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.
  29. Hamp-Lyons, L. (2016). Washback, impact and validity: Ethical concerns. Language Testing, 14(3), 295–303.
  30. Hidri, S. (2014). Developing and evaluating a dynamic assessment of listening comprehension in an EFL context. Language Testing in Asia, 4(4), 2–19. https://doi.org/10.1186/2229-0443-4-4.
  31. House, J. (2014). Translation quality assessment: Past and present. New York: Routledge.
  32. Kane, M. (2010). Validity and fairness. Language Testing, 27(2), 177–182. https://doi.org/10.1177/0265532209349467.
  33. Karami, H. (2013). The quest for fairness in language testing. Educational Research and Evaluation, 19(2–3), 158–169. https://doi.org/10.1080/13803611.2013.767618.
  34. Klitgaard, R. (1985). Choosing elites: Selecting the “best and the brightest” at top universities and elsewhere. New York: Basic.
  35. Kunnan, A. J. (2000). Fairness and justice for all. In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.
  36. Kunnan, A. J. (2010). Test fairness and Toulmin’s argument structure. Language Testing, 27(2), 183–189. https://doi.org/10.1177/0265532209349468.
  37. Kunnan, A. J. (2014). Fairness and justice in language assessment. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1–17). Boston: Wiley.
  38. Lantolf, J. P., & Poehner, M. E. (2013). The unfairness of equal treatment: Objectivity in L2 testing and dynamic assessment. Educational Research and Evaluation, 19(2–3), 141–157. https://doi.org/10.1080/13803611.2013.767616.
  39. Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. New York: Routledge.
  40. Larson-Hall, J. (2016). Our statistical intuitions may be misleading us: Why we need robust statistics. Language Teaching, 45, 460–474. https://doi.org/10.1017/S0261444811000127.
  41. McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Malden, MA: Blackwell.
  42. Mehrens, W. A., & Kaminsky, J. (1989). Methods for improving standardized test scores: Fruitful, fruitless, or fraudulent? Educational Measurement: Issues and Practice, 8(1), 14–22. https://doi.org/10.1111/j.1745-3992.1989.tb00304.x.
  43. Messick, S. (1996). Validity and washback in language testing. Language Testing, 12(3), 241–256. https://doi.org/10.1002/j.2333-8504.1996.tb01695.x.
  44. Moses, M. S., & Nanna, M. J. (2007). The testing culture and the persistence of high stakes testing reforms. Education and Culture, 23(1), 55–72.
  45. Newton, P., & Shaw, S. (2014). Validity in educational and psychological assessment. Los Angeles, CA: Sage.
  46. Pennycook, A. (2001). Critical applied linguistics: A critical introduction. Mahwah, NJ: Lawrence Erlbaum.
  47. O’Sullivan, B., & Weir, C. J. (2011). Test development and validation. In B. O’Sullivan (Ed.), Language testing theories and practices (pp. 13–32). Basingstoke: Palgrave Macmillan.
  48. Popham, W. J. (1991). Appropriateness of teachers’ test-preparation practices. Educational Measurement: Issues and Practice, 10(4), 12–15. https://doi.org/10.1111/j.1745-3992.1991.tb00211.x.
  49. Pan, Y., & Roever, C. (2016). Consequences of test use: A case study of employers’ voice on the social impact of English certification exit requirements in Taiwan. Language Testing in Asia, 6(1), 1–21.
  50. Potter, G., & Lopez, G. (2001). After postmodernism: The millennium. In J. Lopez & G. Potter (Eds.), After postmodernism: An introduction to critical realism (pp. 1–18). London: The Athlone Press.
  51. Razavipour, K. (2010). National matriculation test for English major students: Its impact and some validity evidence. Unpublished doctoral dissertation, Shiraz University, Shiraz, Iran.
  52. Scott, E. D. (2016). Assessment as a dimension of globalization. In S. Scott, D. E. Scott, & C. F. Webber (Eds.), Assessment in education: Implications for leadership (pp. 17–52). New York: Springer.
  53. Shohamy, E. (1998). Critical language testing and beyond. Studies in Educational Evaluation, 24(4), 331–345. https://doi.org/10.1016/S0191-491X(98)00020-0.
  54. Shohamy, E. (2000). Fairness in language testing. In A. J. Kunnan (Ed.), Fairness and validation in language assessment (pp. 39–41). Cambridge: Cambridge University Press.
  55. Shohamy, E. (2001). Democratic assessment as an alternative. Language Testing, 18(4), 373–391. https://doi.org/10.1177/026553220101800404.
  56. Song, X. (2016). Fairness in educational assessment in China: Historical practices and contemporary challenges. In S. Scott, D. E. Scott, & C. F. Webber (Eds.), Assessment in education: Implications for leadership (pp. 67–90). New York: Springer.
  57. Stansfield, C. W., & Winke, P. M. (2008). Testing aptitude for second language learning. In E. Shohamy & N. H. Hornberger (Eds.), Encyclopaedia of language and education, Language testing and assessment (Vol. 7, 2nd ed., pp. 2226–2239). New York: Springer.
  58. Stoneberg, B. D. (2004). A study of gender-based and ethnic-based differential item functioning (DIF) in the spring 2003 Idaho Standards Achievement Tests. Applying the Simultaneous Bias Test (SIBTEST) and the Mantel-Haenszel Chi Square Test. Paper for EDMS 889 Measurement-Statistics Practicum, University of Maryland, College Park. Retrieved from http://files.eric.ed.gov/fulltext/ED483777.pdf
  59. Teo, A. (2013). Promoting EFL students’ inferential reading skills through computerized dynamic assessment. Language Learning & Technology, 16(3), 10–20.
  60. Walters, F. S. (2012). Fairness. In G. Fulcher & F. Davidson (Eds.), The Routledge handbook of language testing (pp. 469–478). London: Routledge.
  61. Weir, C. J. (2005). Language testing and validation. Hampshire: Palgrave Macmillan.
  62. Xi, X. (2010). How do we go about investigating test fairness? Language Testing, 27, 147–170. https://doi.org/10.1177/0265532209349465.
  63. Zwick, R. (2012). Admissions testing in higher education. In C. Secolsky & D. B. Denison (Eds.), Handbook of measurement, assessment, and evaluation in higher education (pp. 382–404). New York: Routledge.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Department of English Language and Literature, College of Literature and Humanities, Shahid Chamran University of Ahvaz, Ahvaz, Iran
