Test-takers’ perspectives on a global test of English: questions of fairness, justice and validity
Although language test-takers have been the focus of much theoretical and empirical work in recent years, this work has been mainly concerned with their attitudes to test preparation and test-taking strategies, giving insufficient attention to their views on broader socio-political and ethical issues. This article examines test-takers’ perceptions and evaluations of the fairness, justice and validity of global tests of English, with a particular focus upon the International English Language Testing System (IELTS). Based on relevant literature and theorizing into such tests, and on self-reported test experience data gathered from test-takers (N = 430) from 49 countries, we demonstrate how test-takers experienced fairness and justice in complex ways that problematized the purported technical excellence and validity of IELTS. Even as there was some evidence of support for the test as a fair measure of students’ English capacity, the extent to which it actually reflected their language capabilities was open to question. At the same time, the participants expressed concerns about whether IELTS was a vehicle for raising revenue and for justifying immigration policies, thus raising questions about the justness of the test. The research foregrounds the importance of focusing attention upon the socio-political and ethical circumstances that currently attend large-scale, standardized English language testing.
Keywords: English language testing; IELTS; Globalization; Test-taker perspectives; Fairness; Justice; Validity
American Education Research Association
American Psychological Association
English as a Second Language
International Development Program of Australian Universities and Colleges Ltd
International English Language Testing System
National Council of Measurement in Education
Target Language Use
Teaching English as a Foreign Language
Language test-takers have been the focus of much theoretical and empirical work in recent years. This work has been mainly concerned with their attitudes to test preparation and test-taking strategies (e.g. Cheng & DeLuca, 2011), giving insufficient attention to their views on the broader politics and ethics attending such tests. This article reports on an International English Language Testing System (IELTS) study at an Australian university, focusing particularly upon test-takers’ perceptions and evaluations of the test from the perspective of the interrelated concepts of fairness, justice and validity.
serve as both door-openers and gate-keepers. That is, decisions that are made on the basis of language assessments will involve allocating resources, opportunities, or rewards to some while denying these to others. (Bachman & Purpura, 2008, p. 456)
Is it just for a university to demand that international L2 students meet language requirements that are not met by all L1 students, who are exempt from taking the test? Is it just for a country to raise the language requirements for citizenship to a literacy level that de facto excludes people who have not had access to organized education or schooling? [ … ]. (p. 143, emphases added)
How test-takers perceive English proficiency tests that entail socio-political and ethical questions in terms of fairness, justice and validity, and what kind of experiences they have of test-taking and test impact deserve critical scrutiny, particularly given that studies have shown that test results mediate global mobility (Ahearn, 2009; Deygers, 2017; Hamid, Hoang & Kirkpatrick, 2019; Hoang & Hamid, 2017). As would be expected, the language testing literature has given significant attention to test fairness and validity; the concept of justice has also gained attention in the past decade (see Kunnan, 2014). However, while scholars and researchers engage with fairness, justice and validity in an intellectual sense, test-takers experience the consequences of different degrees of (un)fairness, (in)justice and (in)validity in very material ways. Therefore, understanding test-takers’ perspectives/perceptions may help access their lived experiences of fairness, justice and validity, which may lead to more socially responsive enactment of language testing and assessment. The growing research attention given to IELTS test-takers, but the relatively inadequate understanding of broader conceptions of fairness, justice and validity from test-takers’ perspectives, were the primary motivations for undertaking the research reported in this article.
IELTS in a globalized world
IELTS is a global test of English, which is jointly owned by the British Council, IDP (International Development Program of Australian Universities and Colleges Ltd): IELTS Australia and Cambridge English Language Assessment. It tests English as a second language (ESL) proficiency in the areas of listening, reading, writing and speaking, and reports test-takers’ performances on 9-point band scores (1 = Non user and 9 = Expert user of English). Raw scores are calculated into an aggregated score for each of the four components, and then a single score, combining the results of the four modules, is assigned to each test-taker. IELTS is divided into two test types: Academic and General Training, both following the same scoring and reporting procedures. The IELTS website provides details on the testing and scoring procedures, emphasizing their fairness, reliability and trustworthiness. Since its introduction in 1989, IELTS authorities have carried out or commissioned research on various aspects of the test and research findings have led to multiple revisions in the past two decades (see Chalhoub-Deville & Turner, 2000; Stoynoff, 2009). Fairness and test impact have received considerable attention in the IELTS authorized research (see Hawkey, 2006; Hyatt, 2013).
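As a rough illustration of how the four component bands might be combined into a single score, the sketch below assumes the rule described in IELTS's own public guidance: the overall band is the mean of the four component bands, rounded to the nearest half band, with exact quarter-points rounded up. The article itself does not specify this rule, and the function name `overall_band` is hypothetical.

```python
import math


def overall_band(listening, reading, writing, speaking):
    """Combine four component bands into one overall band.

    Assumption (not stated in the article): the overall band is the
    mean of the four components, rounded to the nearest half band,
    with .25 and .75 rounded upwards.
    """
    average = (listening + reading + writing + speaking) / 4
    # Scale to half-band units, round half up, then scale back.
    return math.floor(average * 2 + 0.5) / 2


# Component bands of 6.5, 6.5, 5.0 and 7.0 average 6.25.
print(overall_band(6.5, 6.5, 5.0, 7.0))  # 6.5
```

Under this assumed rule, a candidate can reach a given overall band while one component falls well below it, which is relevant wherever institutions set requirements for each component as well as for the overall score.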
The IELTS website claims that IELTS ‘is the world’s most popular … English language test’ (see https://www.idp.com/global/ielts) for study, work and migration, with over 3 million tests taken in 2017. Test-takers can take the test in more than 1100 locations in over 140 countries. The test’s acceptability has also been extended globally with the number of institutional test-users exceeding 9000 entities, including schools, universities, employers, immigration authorities and professional bodies in both traditional and emerging English-speaking nations. Initially introduced to assess language proficiency of ESL students seeking admission to universities in English-speaking countries, IELTS has recently been assigned a gate-keeping role for immigration and employment purposes in Australia, Canada and the UK (see Ahearn, 2009; Hamid et al., 2019; Hoang & Hamid, 2017; O’Loughlin, 2011). Educational institutions and immigration departments in these countries have set up different IELTS score requirements for prospective students and visa applicants (see Hyatt, 2013).
While the global expansion of IELTS may help enhance its technical quality and recognition, drawing upon its accumulated resources, expertise and discourses of global trust, the expansion may also raise some concerns. For example, the continued expansion may affect its educational applicability, reinforcing its performative ‘business-oriented’ purposes (Davidson, 1993; Templer, 2004). While educational commodification is a global concern (Luke, 2004), there may be questions about whether and to what extent profit motives guiding IELTS and similar tests align with their stated goals of measuring language proficiency. There may also be concerns about whether tests driven by a profit-motive pose social, educational and ethical issues and challenges for stakeholders including test owners, researchers and language educators (see Sarich, 2012 for an examination of these issues in relation to external standardized tests in Japan).
As a transnational test operating in a globalized world, IELTS seeks standardization of the test and testing procedures. The test is produced in a center based in the UK. The hundreds of test sites located across the world have been set up as the operational units which administer the test, following written protocols for test administration produced by the center. It can be argued that this centralized operation has ensured test fairness; no matter where test-takers are located, they receive the same test input under comparable conditions. However, while this one-size-fits-all approach may be construed as fostering ‘fairness’, this also ignores that test-takers come from different socioeconomic, sociolinguistic and sociocultural backgrounds with variable interests, motivations, strategies and experiences of learning and using English in different social contexts.
The test’s emphasis on a particular variety of English is also reflective of its centrist tendencies. Although IELTS claims to have adopted ‘international English’ in response to the changing face of English (Taylor, 2006, 2010), its definition of ‘international’ has a narrow scope, and actually refers to what it calls ‘native’ varieties of English; within IELTS, these are understood as British, American, Australian and New Zealand English. Other varieties of English such as Indian or Malaysian or Singaporean English are yet to be substantively considered (Davies, 2009). The IELTS model of English potentially benefits those who have access to both metropolitan and ‘internationally-accepted’ varieties of English, and disadvantages those who speak only local varieties. Moreover, it undermines the diversity of Englishes that actually exists in an increasingly fluid world, thereby reproducing a linguistic hierarchy which discriminates against ‘non-native’ Englishes (Kachru, 1992).
What should drive test design? Should it be characteristics of the people taking the test, or should it be the purpose of the test and the decisions being made with it? (Brown, 2004, p. 319)
If IELTS is meant for L2 test-takers, and if the language itself has been transformed into multiple varieties globally, it may be unfair to test people in a variety that many participants are not exposed to in their social and linguistic environments. However, a more varied conception of English underpinning IELTS might be seen as inappropriate when the test purpose is taken into consideration; it could be argued that when the target language use (TLU) domain is characterized by ‘native’ English, this becomes an acceptable criterion upon which to judge test-takers’ English capacity. In this sense, excluding ‘non-native’ Englishes from IELTS may appear reasonable.
Nevertheless, this latter argument seems weak because the TLU domain in ‘native’ English-speaking countries has now become a meeting place of students, academics and migrants from all over the world, many of whom speak a wide variety of Englishes (Batziakas, 2017). The test construct defined with reference to traditional conceptions of the TLU domain may no longer be acceptable, since the linguistic landscape has significantly changed in these contexts, raising questions about whose variety of English should be included in the definition of proficiency, and how such variability may affect considerations of test performance.
Finally, the use of IELTS to control global flows of people may also be of concern. Although the danger of linguistic deficiency in border-crossing cannot be underestimated (Piller & Takahashi, 2011), the question of whether language should be given a gate-keeping role in restricting people’s mobility raises ethical challenges (Deygers, 2017 as cited above; see also Capstick, 2011). If the test opens up opportunities for those who are successful, it may be closing down opportunities for those who do not succeed against more standardized measures (see Bachman & Purpura, 2008).
In sum, the location of IELTS in a globalized world, its use as a gate-keeper of global mobility and the variety of English used in the test and actual Englishes used in TLU domains call for further investigation into issues of fairness, justice and validity from the perspectives of those who are directly affected by test outcomes. This article reports on data from an IELTS study to understand test-takers’ perspectives on fairness, justice and validity, and to give voice to test-takers on social, political and ethical grounds more broadly.
Fairness, justice and validity in language testing
Language testers who have [ … ] written on justice, have grappled with theoretically disentangling it from fairness, or determining its relationship with validity. (p. 147)
The interest has generally drawn on Messick’s (1989) unified view of validity (e.g. McNamara & Ryan, 2011) and/or standards for educational assessment (AERA, APA, & NCME, 2014; see also Kunnan, 2000, 2004, 2008, 2014; Taylor, 2010). However, scholars have provided varied understandings, interpretations and perspectives on the concepts of fairness and justice and how they relate to validity (see Davies, 2010; Kane, 2010; Kunnan, 2000, 2004, 2008, 2014; McNamara & Roever, 2006; McNamara & Ryan, 2011; Weir, 2005; Xi, 2010).
Kunnan is credited with highlighting the ‘primacy of fairness’ within ‘a framework of social justice’ (Kunnan, 2000, p. 1). Inspired by ethical concerns in language testing and drawing on professional standards and codes of practice, he initially defined fairness as comprising three elements: validity, access and justice. He subsequently revised these elements and included new items in his test fairness framework (Kunnan, 2004, 2008) which comprises validity, absence of bias, access, administration and social consequences.
For Kunnan (2004, 2008), fairness is a super-ordinate concept which subsumes validity. One implication of this conceptualization is that fairness issues cannot be addressed within the framework of validity and therefore need to be investigated separately. This is the second of the three types of fairness-validity relationships discussed by Xi (2010): (1) fairness as an independent test quality, (2) fairness as all-encompassing, and (3) fairness as directly linked to validity. Xi’s (2010) own proposal for investigating fairness is informed by the third type. While Kunnan (2010) critiques Xi’s proposal for its limited understanding of fairness, Davies (2010) emphasizes the supremacy of test validation, thus rendering the linking of the fairness argument to validity underpinning Xi’s proposal redundant.
Kunnan’s (2004, 2008) conceptualization of fairness can be seen as responsive to concerns about procedural fairness, but the conceptualization may also be utilized to cultivate substantive fairness (Kane, 2010). The former requires that test-takers be tested in essentially the same way under the same conditions, while the latter necessitates that score interpretations and test-based decisions be reasonable and equally appropriate for all test-takers.
Sub-principle 1: A test ought to have comparable construct validity in terms of its test-score interpretation for all test-takers.
Sub-principle 2: A test ought not to be biased against any test-taker groups, in particular by assessing construct-irrelevant matters. (Kunnan, 2008, p. 14)
The principle of justice here refers to what Xi (2010) terms comparable validity. Kunnan here seems more concerned with what Lam (1995) calls the equality view of fairness. Lam (1995) also mentions the equity view of fairness, which is antithetical to equality because instead of being concerned with comparable procedures and outcomes, an ‘equitable assessment is tailored to the individual student’s instruction context and social background’ (n. p.).
In short, in conceptualizing fairness, many scholars have taken what can be called a nothing-beyond-the-test position, even as they may believe themselves to be adopting justified approaches. Such narrower approaches provide little scope for asking questions about the socio-political purposes of tests.
By fairness, here we mean the extent to which the test quality, especially its psychometric quality, ensures procedural equality for individual and subgroups of test-takers and the adequacy of the test representation of the construct in test materials and procedures. (McNamara & Ryan, 2011, p. 163)
any problem with the test may not inhere in its quality, but its very existence and use in the first place, no matter how technically sophisticated and ‘fair’ in the narrow sense it may be. (McNamara & Ryan, 2011, p. 164)
attempts to provide principled bases for fairness and justice as applied to the institution of assessment. It does this by applying the idea of fairness as relating to persons—how assessments ought to be fair to test takers—and the idea of justice as relating to institutions—how institutions ought to be just to test takers. (Kunnan, 2014, p. 2, italics original)
McNamara and Ryan acknowledge that the social concerns that are brought within this scope of justice are derived from their interpretations of Messick’s (1989) facets of validity. Messick defined validity as ‘an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment’ (p. 13; emphasis original). This view of validity marks a departure from earlier conceptualizations which considered validity as a property of tests as indicated by the concept of ‘construct validity’ that refers to whether tests measure what they intend to measure. For McNamara and Ryan (2011), construct validity is an element of fairness as it refers to technical issues. To them, neither fairness (as discussed above) nor validity can guarantee justice. Justice demands validity which, in turn, is contingent on fairness, but a fair and valid test may also be imposed unjustly on test-takers (Deygers, 2017).
McNamara and Ryan’s view of justice owes much to Shohamy’s (2001) work on critical language testing (CLT) which aims to expose the ‘potential and real injustice of tests, rather than of critiquing their psychometric qualities as the principal source of their illegitimacy’ (McNamara & Ryan, 2011, p. 165). Their view is also comparable to the principle of beneficence guiding Kunnan’s (2008) later test fairness framework, which states that a ‘test ought to bring about good in society, that is, it should not be harmful or detrimental to society’ (p. 14). The test context framework that Kunnan (2008) introduced to complement his earlier test fairness framework incorporates this principle which is ‘necessary to examine tests and testing practice from a wide context in order to more fully determine whether and how these tests are beneficial or detrimental to society’ (p. 14). Kunnan’s (2014) revised principle of justice also emphasizes that an assessment institution ‘ought to be just and bring about benefits in society’ (p. 8).
Fairness, as referring to test-internal technical quality which ensures that test-takers are treated equally and are given equal opportunity to demonstrate their best performance.
Validity, as referring to the adequacy, appropriateness and justification of decisions made on the basis of test scores.
Justice, as referring to test-external issues to ensure that the use of tests for their stated purposes is justified and their introduction does not have harmful impacts on society.
Fairness, justice and validity are interdependent qualities which are not adequate on their own; affecting one necessarily affects the other two. Achieving an optimum balance between them would be a reasonable goal keeping in mind the usefulness and practicality of testing.
Fairness, justice and validity from test-takers’ perspectives
The importance of understanding test-takers’ perspectives has been emphasized in the literature. For example, Weir’s (2005) socio-cognitive framework for an evidence-based validation model has an important focus on how test-takers’ physical, psychological and experiential characteristics can be taken into consideration. O’Sullivan and Green (2011) provide further details on these domains of test-taker characteristics. However, these models, as well as much of the empirical work, are guided by fairness and validation in a narrow sense. Peirce and Stein (1995) demonstrated how a multiple-choice test conditioned a group of Black students in a South African school to submit to meanings expected by the test regime at the expense of their own meanings, histories and experiences. However, guided by validity in a procedural sense in the selection of test content, they gave little attention to the dehumanizing potential of tests more broadly. Working with IELTS test-takers and teachers, Hawkey (2006, 2008) investigated how these stakeholders perceived the test in terms of, among other issues, fairness, difficulty and test anxiety. Of note was the inclusion in the survey of this structured question: Do you think IELTS is a fair way to test your English proficiency? While the study’s engagement with students and teachers from a fairness point of view is welcome, again, it can be argued that the focus here was on procedural fairness, although the way the question was posed left room for multiple interpretations.
However, the real power is held not by the stakeholders but by the testing agencies. These agencies are profiting from the parental worry over their children’s future educational opportunities. (Chik & Besser, 2011, p. 88)
This article reports on data from an IELTS study to add to this body of work on socio-political and ethical issues in language testing from test-takers’ perspectives.
Context, participants and data collection methods
The article is based on a larger study into language test-takers’ test-taking experiences, undertaken at a major, metropolitan Australian university where the authors work. The participants (N = 430, female = 45.3%, male = 54.7%) were from 49 countries including five ‘native’ speakers of English (three from the UK and two from Ireland). About 60% of the test-takers were studying at the time of the research and 70% took the test for study purposes. Well over half of the sample took the test at least twice and a considerable proportion of participants took the test three, four or five times. Finally, the largest proportion of test-takers (about 80%) was from the more successful group who scored between 6.5 and 9 on the IELTS band scale.
The main instrument for data collection was a survey for the larger study available in both online and paper versions (see Hamid, 2014 for details). The survey questionnaire was organized into three sections, the first of which asked for information on test-takers’ national backgrounds, current occupational/academic status, the reasons for taking IELTS and scores obtained. The second part contained 40 structured items on test-taking experience, fairness and validity and the socio-politics of IELTS, and the varieties of English used in the test. The final section contained an open-ended question inviting participants to comment on other issues and/or provide suggestions for test improvement. The present article is based on test-takers’ responses to this open question. These responses were particularly important given that test-takers freely expressed their views on many aspects of the test without being constrained by space or response format.
Of the 430 survey participants, 343 (80%) volunteered written comments, producing a corpus of 18,500 words. On average, each response contained 53 words, the longest one comprised 536 words, and the shortest one just two words. In part, and following procedures for qualitative coding and content analysis (Corbin & Strauss, 2008; Dörnyei, 2007), the responses were read repeatedly and coded at the level of phrases, sentences and paragraphs, following a broadly inductive approach. A colleague, whose research focuses on test-takers’ perspectives on international proficiency tests including IELTS and TOEFL, read all responses and the codes and she agreed with over 98% of the coding. At the same time, and reflecting how research is also always an active process reflecting the interests of the researchers, and not some sort of ‘objective’ exercise, key codes were analysed in light of relevant theorizing and literature on IELTS, pertaining to conceptions of fairness, justice and validity, as outlined above. In this way, the data analysis process involved simultaneous processes of engagement with theorizing and data involving both inductive and deductive processes, in light of our particular focus.
More broadly, the research is underpinned by critical and constructivist views that recognize people’s voice and agency in making sense of their experiences, and of the necessarily socially situated and embedded nature of such responses. Positivistic views may flag subjectivity and potential bias in the representations of selves and personal experiences; these ‘perception’ data may also not be considered hard evidence from such epistemological perspectives. However, being guided by perspectives that respect individual agency, and research as an always active process on the part of the researcher (Bourdieu & Wacquant, 1992), we believe that ‘people’s reasons and accounts provide evidence whose status is ontologically real’ (Corson, 1998, p. 63, referring to Bhaskar, 1986), even as this ‘reality’ is also simultaneously an act of construction through the research process. Participants’ voices and representations or their ‘perceptions’ are important resources to help develop an understanding of fairness, justice and validity from test-takers’ lived experiences.
Test-takers’ perceptions covered a variety of issues related to evaluations of the test (e.g. ‘I think it is ok and I have nothing to complain about it’), their experience of taking the test (‘Taking IELTS was not a pleasant experience’), the test fee, the duration of its score validity, the relevance of ‘non-native’ varieties of English, their perceptions of the test’s ability to measure language proficiency, their suggestions for test improvement and issues of fairness, justice and validity. In this article, their responses to these latter issues are of particular interest, and were expressed in relation to whether IELTS was construed as beneficial, whether IELTS was seen as an accurate measure of English proficiency and whether decisions made based on test-scores were justified and the purpose of the test. We endeavour to show how notions of fairness, justice and validity were intertwined within these themes, even as we employ a degree of analytical separation to assist the reader in following our arguments.
IELTS as a beneficial test?
It is therefore imperative to have good English (good IELTS score) before even starting their [international students’] studies. A[s] it currently stands, I’ve seen a lot of students graduating with abysmal level of English [...] For this reason, I support the immigration policy that requires you to retake IELTS when applying for skilled migration visa, even if you graduated in Australia. (emphasis added)
From this respondent’s perspective, the test was beneficial and justified (just) because it provided an ‘accurate’ measure of test-takers’ language ability (good IELTS scores equal good English). This view seems to represent that fairness and validity equal justice, which McNamara and Ryan (2011) as well as Deygers (2017) might consider a misrepresentation. Three other test-takers not only believed in the test’s ability to measure their language proficiency but also argued that IELTS and TOEFL (Test of English as a Foreign Language, another global test of English) should be merged so that there is one English test in the world, with universal standards and assessment criteria. On such renderings, broader conceptions of beneficence guiding Kunnan’s (2008) test fairness framework, with its emphasis upon how a ‘test ought to bring about good in society’, and that ‘it should not be harmful or detrimental to society’ (p. 14) did not bear upon these respondents’ understandings of the test (see also Kunnan, 2014).
The necessity of maintaining such standards—understood as a requirement for ‘fairness’—was the basis for the rejection of ‘non-native’ Englishes, and these respondents’ insistence on British English. Eighteen respondents were against alternatives to standard/‘native’ English varieties, compared to eight respondents who supported their inclusion. Only one respondent upheld a mixed position that suggested that ‘non-native’ varieties of Englishes could be included in speaking, but not in the other components (see also Hamid, 2014). The dominance of ‘native’ English language varieties, but alongside a considerable advocacy for other Englishes, reflects a point of possibility for fostering a more responsive approach to the plurality of Englishes that actually characterizes English speech and writing throughout the world (Kachru, 1992; Kirkpatrick, 2010). Perhaps a broader conception of validity was evident in such a response, associated with notions of adequacy or appropriateness of inferences pertaining to fairness in relation to test scores, as evident in McNamara and Ryan’s (2011) and Messick’s (1989) understandings.
IELTS as an accurate measure of English proficiency?
I personally believe that the structure and design of present IELTS examination is good. But I think that it does not measure one’s ability in English. (R11)
I do not want to go against the IELTS test. But, I want to say that the test modules should be designed [in] such a way that it really helps to check the test takers’ English proficiency. It means that if a person scores a band score, it should reflect his [sic] real skill of English. Thirdly, the test should be designed in such a way that persons scoring [the] same band score should have the same level of proficiency. (R69)
You need to know exactly how the test works to score highly, which means it isn’t testing your English ability—it’s testing your ability to pass IELTS.
The test has recently lost its value (or both reliability and validity) as it has claimed. It doesn't really reflect how proficient a test taker is [...] A score doesn't mean a real score … it is a minimum license for one to get into an English-speaking environment. Is what happens in a test similarly what happens outside? I'm doubtful with its authenticity. (R293)
I am not sure if the result of the IELTS test is an accurate assessment of a person’s English proficiency [...] Some of my students did not achieve the score they needed at their first attempt and they decided to take the test the second time in the hope of improving their score. However, after a lot of days taking practice tests and revising for the exam, the results of their second IELTS test were even lower than the first exam.
I would say that IELTS exam is not a very good criterion to test peoples’ language abilities. As you know many people take the exam twice in a row just in two weeks and their first score has a significant difference with the second one. In my opinion it shows that the test is not reliable. (R350)
For example, I took the IELTS test three times in the last two years looking for a 7 in all the bands for immigration purposes. The first time I got 8 in writing and 6.5 in speaking. The second time (around 6 months later) my results swapped: I got 8 in speaking and 6.5 in writing. The third time, I finally got 7 in both components. Does it make any sense? It doesn't for me... (R271)
[...] my overall score was ok but I got slightly less score in a particular module. But not consistently in one module. For example one time I got 6.5 in writing but 5.5 in speaking but in the later test, my score is OK in speaking but I got less in writing. So this is a problem. (R211)
This respondent did score band 7, his target, in all four components, but he did not score all 7s in the same sitting. Such situations, which were common in the data, illustrate how technical issues (e.g. reliability and fairness) could be linked to justice issues (e.g. use of tests) through the activation of validity issues (e.g. inappropriate or inadequate score-based decisions).
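The distinction at stake here, between meeting a target in every component within one sitting and meeting it in every component at some point across sittings, can be made concrete with a small sketch. The scores below are hypothetical (they loosely echo the pattern of swapped results that respondents described, not any actual record), and the function names and the target of 7.0 are illustrative assumptions.

```python
COMPONENTS = ("listening", "reading", "writing", "speaking")
TARGET = 7.0  # illustrative target band for every component


def meets_in_one_sitting(sittings, target=TARGET):
    """True if some single sitting reaches the target in all components."""
    return any(
        all(sitting[c] >= target for c in COMPONENTS) for sitting in sittings
    )


def meets_across_sittings(sittings, target=TARGET):
    """True if each component reaches the target in at least one sitting."""
    return all(
        max(sitting[c] for sitting in sittings) >= target for c in COMPONENTS
    )


# Hypothetical scores: writing and speaking swap between two sittings.
sittings = [
    {"listening": 7.5, "reading": 7.0, "writing": 8.0, "speaking": 6.5},
    {"listening": 7.0, "reading": 7.5, "writing": 6.5, "speaking": 8.0},
]

print(meets_in_one_sitting(sittings))   # False
print(meets_across_sittings(sittings))  # True
```

A decision rule that accepts only the first condition treats this hypothetical candidate as failing, even though each component has been demonstrated at or above the target at least once. This is the kind of gap between score reliability and score-based decisions that respondents questioned.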
IELTS for economic and political purposes?
Whichever area the test taker had a lower band score should only be the area to re-take. Because the test is so costly, I perceived it to be income generation by the organisation and making it hard and tough for skilled professionals who have higher qualifications than the lazy English speaking natives here in Australia.
‘a money-producing machine for the native countries, i.e. the UK and Australia’ (R2),
‘the IELTS test is one of the main way[s] to make money for the test producer’ (R138)
‘Although the test aims to assess English proficiency, the main objective is [to] make money’ (R163)
‘This test should not be used for the purpose of making money and profit’. (R190)
Despite this, I was forced to take the IELTS test as part of my application for a skilled migration visa to Australia. I think this is madness, and a waste of money. I do think that in this case it is purely a money making exercise and nothing else. (R302)
After doing the test a couple of times I have just realised what a biggest scam the Ielts [sic] is. They are making it extremely hard to pass so that they can rip off money from the students. (R208)
For example, some international students who already prove their language skills to their educational institutions still have to take it because it’s a requirement of the dept [Department] of immigration. (R401)
It does have minor flaws, however if anyone asks me to name the best possible English test, I would not hesitate to recommend IELTS. Unfortunately, IELTS is currently being abused by the system/government/immigration to put too much burden on non-native speakers, such as setting the bar a little too high for them. Asking overall and/or each band score(s) of 7.0 seems to be unreasonable. (R398)
IELTS is quite a paradox. Certain governments have used it as a requirement for migration and educational purposes. Yet, if they were to apply the same test on their own citizens, their own citizens would not be able to pass these tests. (R311)
I know I am international student but [...] there are a lot of locals who couldn't even spell properly, how come they are expecting us to have the perfect English wherein [sic] some of them cannot even spell properly. Will they pass the IELTS?
By requiring ‘native’ speakers to take IELTS for immigration purposes, some sort of fairness seems to have been established (even as this was contested by ‘native’ English speakers, as indicated earlier). However, test-takers, who were yet to be convinced of many aspects of IELTS policy and its use for overly commercial motives, rallied against what they perceived as the injustice of the IELTS process by pointing to a community of reference within the TLU society whom they believed to be less proficient in English than themselves. Test-takers’ insights strongly indicated that their level of proficiency should be more fully taken into consideration in defining the IELTS construct, establishing its purposes and evaluating its performance standards.
Discussion and conclusions
Investigating how global English tests such as IELTS impact test-takers’ lives and global mobility, and how test-takers perceive the processes and outcomes of test design, administration and use, is more than an academic exercise; it is an imperative for social justice, not only between groups of test-takers but also between test-takers and test-owners/test-users who are locked in a relationship of inequality (Deygers, 2017). While the commercial viability of tests, the pragmatics of testing and the limits of testability including test qualities (Bachman & Palmer, 1996) need to be appreciated, the science and technology of tests and professional standards alone may not be adequate for guiding test design, administration and use (Hamid, 2016). Critical reflection on the operation and the intent of IELTS points to the complexity of fairness, justice and validity, as exemplified in the present study. To date, test-takers’ perceptions of IELTS in relation to broader, more encompassing conceptions of fairness, justice and validity have not been given adequate attention in the literature. This article provides an example of engagement with test-takers in the hope that test-takers’ experiences and perspectives will be more fully taken into consideration in test design, administration and use (Dimova, 2012; Green & Andrade, 2010).
On a narrower rendering of issues of fairness, the test-takers were critical of IELTS even as a large proportion initially indicated that, from their perspective, the test was ‘fair’ in that it required all participants to take exactly the same test. This ‘sameness’ refers to procedural fairness, which was seen as ensuring a level of equivalence that could not otherwise be achieved. At the same time, however, an overwhelming proportion expressed the view that the test did not provide an accurate measure of their proficiency, raising questions about the reliability of scores and, as a consequence, about fairness and validity. Several explanations emerged from the data for the perception that the test did not test participants’ actual language proficiency. First, personal experiences of test repetition showed substantial variations in scores across test sittings. Secondly, test-takers’ experiences of engaging in English in the host society enabled them to understand that higher scores did not necessarily mean higher levels of performance in language use contexts and vice versa. Thirdly, they believed that the test did not guarantee comparative validity, meaning that two test-takers with the same scores did not necessarily demonstrate the same level of English proficiency. Such views may reveal the ‘potential and real injustice of tests’ (McNamara & Ryan, 2011, p. 165), and how test-takers felt that they had to navigate these vicissitudes, even as they appeared to make little sense to them. Test-takers’ self-reported experiences reflected the gap between testing in theory (as reflected in scores) and their actual proficiency and real-world language use.
Most significantly, participants’ (subjective) responses reveal how a broader conception of justice is restricted by the testing practices and processes associated with IELTS. While some participants argued that a sense of procedural fairness was evident in the way the test was constructed, and how it was enacted, others were critical of what they construed as a set of practices designed for external purposes—namely to generate profits from an international student market, and to serve as a potential barrier to access, and restriction upon immigration. Notions of beneficence guiding Kunnan’s (2008) test fairness framework, with its emphasis upon how a ‘test ought to bring about good in society’, and that ‘it should not be harmful or detrimental to society’ (p. 14), were sorely tested by an evaluation framework that seemed to be more driven by extraneous motives of profit, as perceived by the test-takers, than by efforts to foster the sorts of cosmopolitan, harmonious and diverse societies that could be cultivated through providing opportunities for respectful engagement with individuals and groups from rich and varied cultural backgrounds. Test-takers reported that they experienced a lack of fairness and a sense of injustice that seemed to problematize the technical excellence supposedly associated with the ‘validity’ of IELTS.
Reflecting McNamara and Ryan’s (2011) conception of justice, test-takers’ responses indicated a broader conceptualization of justice than simply a technicist focus upon issues of equality as ‘sameness’ in relation to testing alone. Such responses also reflect how processes of recognizing difference in robust, activist ways (Fraser, 1997) were severely lacking, and that the tests seem to have been ‘stacked against’ them, even as participants simultaneously perceived that the test played a crucial role in their pursuit of opportunities in the countries in which they studied, and to which many may have wished to move upon completion of their studies. Participants’ responses also revealed that ultimately it was test-owners who were the real beneficiaries of the testing process. As Chik and Besser (2011) indicated with reference to similar international tests for young English learners, profit maximization was considered to be behind the test-retake policy of IELTS, which was also, as some test-takers believed (or perhaps misbelieved), deliberately made difficult for the purpose of maximizing profits. The psychological, social and economic costs that they reported they incurred in taking IELTS also indicated validity concerns, particularly when participants expressed reservations about ‘native’ English speakers’ capacity to succeed in the test. Participants considered it unjust to impose a level of proficiency on ESL speakers which was less than evident within the TLU community more broadly. Test-takers may have blurred the boundaries of the domains of testers and test-users, but it may be testers’ responsibility to develop test literacy and awareness among test-takers. Testing agencies may also need to improve test-takers’ limited understanding of the interdependence of fairness, validity and justice, specifically to help them understand how achieving a reasonable balance of the three concepts may require a principled compromise of each.
While more research involving larger samples of test-takers using varied methods of data collection is needed beyond participants’ perceptions, these powerful insights (however subjective) deserve greater attention, and may help inform actions on the part of IELTS providers, governments, and testing researchers. For IELTS authorities in particular, the research reveals a need to consider, explain and perhaps reconsider (1) the rationale behind the test, including the period of validity; (2) the allowable error margins; (3) technicist notions of comparative validity; and (4) how to enhance assessment literacy among test-takers. It is also necessary to rethink the re-take policy and establish ways of communicating with test-takers to address perceived concerns about IELTS and its broader goals.
The research revealed the complexity and problematics that attend standardized English test-taking processes such as those associated with IELTS, and how such tests can be seen as part of broader processes of homogenizing, global English testing practices. It reinforces the need to sustain critical engagement with the nature of such practices more generally. The ways in which test-takers perceived and experienced the fairness and validity of the tests, and the extent to which these experiences reinforced and challenged notions of justice, revealed such testing practices as requiring much closer scrutiny, particularly in relation to both motives and processes that drive their use in a globalizing world.
IDP is owned by 38 Australian universities and the job website SEEK. As well as testing services, it provides international student placement services. See www.australia.idp.com.
The IELTS website (https://admissiontestportal.com/en/pages/2-about-ielts/7-what-is-ielts/) notes that the ‘IELTS test is trusted by over 9,000 organizations’.
The website indicates: ‘All standard varieties of native-speaker English, including North American, British, Australian and New Zealand English are accepted’ (https://www.ielts.org/en/what-is-ielts/ielts-introduction).
For example, the website claims that IELTS ‘works around the world’ and ‘opens doors’ (https://www.idp.com/global/ielts).
Ethical clearance for the research was obtained from this and another university in the same city.
The following prompt was used for the open-ended comments:
‘Do you have anything else to say about the test? Would you suggest any changes to the test? Please make your response as long as you like.’
That is, manual coding of the data found 51 references to positive evaluations of the test.
Respondent codes such as these are used throughout to ensure anonymity.
The re-take policy is described as follows in the Information for Candidates document on the IELTS website: ‘There are no restrictions on re-taking IELTS. If you do not get the result you wanted, you can register for another test as soon as you feel you are ready to do so’. When re-taking IELTS, candidates have to redo all four components, including the ones in which they had satisfactory results from a previous sitting.
Test producers, test owners and test users (e.g. Australian universities) are often conflated in the test-takers’ responses. Although the IELTS website makes the joint ownership of the test clear, because these owners are British and Australian institutions, often test-takers referred to these two countries, in place of these specific institutions.
The relevant policy is mentioned in the Information for Candidates on the website, although no reason is provided for the 2-year period. It is also unclear whether this policy is needed for the IELTS partners or the test users (see Hamid et al., 2019).
For example, the Australian Council of TESOL Associations in its formal submission to the Senate Legal and Constitutional Affairs Committee regarding the Australian Citizenship Legislation Amendment Bill 2017 noted:
there is no evidence that the English language skills of permanent residents decline over time. There is credible evidence to the contrary, not least from the 2011 ASRG [Australian Survey Research Group] report, which was commissioned by the then Immigration Department. (http://www.tesol.org.au/files/files/577_Sub292_-_ACTA_sub_to_Citizenship_Inquiry_July_2017.pdf)
However, it has also been reported that some international students do not improve English proficiency even after their graduation from Australian universities (see Burton-Bradley, 2018).
The new points system for skilled migration to Australia provides points for English proficiency for both ‘native’ and ‘non-native’ speakers of English.
The authors would like to acknowledge the support received from Dr. Ngoc Hoang who worked as a Research Assistant in the project. Specifically for this paper, we would like to acknowledge her verification of the coding. Permission has been obtained from her for this acknowledgment. We are also thankful to colleagues and anonymous reviewers who provided helpful feedback on earlier versions of the article.
MOH conducted the research, collected the data (with assistance from a Research Assistant), analysed the data and produced the initial draft of the article (55%). IH reanalysed the data and contributed to the framing of the study and rewrote some sections of the paper (25%). VR contributed to the analysis and to rewriting some sections (20%). All authors read and approved the final manuscript.
The research project that the paper is drawn from was funded by the University of Queensland, Australia. The fund was part of the University’s New Staff Start-up Grant which was approved for the first author of the paper.
Availability of data and materials
Data that have been used in this paper cannot be shared because permission has not been sought from the participants for this data sharing through the informed consent process.
M. Obaidul Hamid (Orcid: 0000-0003-3205-6124) is Senior Lecturer in the School of Education, The University of Queensland. Dr. Hamid’s research focuses on the policy and practice of TESOL education in Asia. He is co-editor of Language planning for medium of instruction in Asia (Routledge; 2014).
Ian Hardy (Orcid: 0000-0002-8124-8766) is Associate Professor in the School of Education, The University of Queensland. His research focuses on the politics of educational policy and practice, with particular attention to the nature of teachers’ work and learning. He is author of The politics of teacher professional development: Policy, research and practice (Routledge; 2012).
Vicente Reyes (Orcid: 0000-0002-1539-1839) is Senior Lecturer in the School of Education, The University of Queensland. His research focuses on educational transformations, technology innovations in education and comparative education reform. His latest book is entitled Mapping the Terrain of Education Reform: Global Trends and Local Responses in the Philippines (Routledge; 2016).
The authors declare that they have no competing interests.
- AERA, APA, & NCME. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association, American Psychological Association and National Council on Measurement in Education.
- Ahearn, S. (2009). ‘Like cars or breakfast cereal’: IELTS and the trade in education and immigration. TESOL in Context, 19(1), 39–51.
- Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford: Oxford University Press.
- Bhaskar, R. (1986). Scientific realism and human emancipation. London: Verso.
- Bourdieu, P., & Wacquant, L. (1992). An invitation to reflexive sociology. Chicago: The University of Chicago Press.
- Burton-Bradley, R. (2018). Poor English, few jobs: Are Australian universities using international students as ‘cash cows’? Australian Broadcasting Corporation. Retrieved from https://www.abc.net.au/news/2018-11-25/poor-english-no-jobs-little-support-international-students/10513590.
- Capstick, T. (2011). Language and migration: The social and economic benefits of learning English in Pakistan. In H. Coleman (Ed.), Dreams and realities: Developing countries and the English language (pp. 1–23). London: British Council.
- Davidson, F. (1993). Testing English across cultures: Summary and comments. World Englishes, 12(1), 113–125.
- Dimova, S. (2012). Matura’s rocky road to success: Coping with test validity issues. In I. Csépes & D. Tsagari (Eds.), Collaboration in language testing and assessment (pp. 143–157). Frankfurt: Peter Lang Verlag.
- Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. Oxford: Oxford University Press.
- Fraser, N. (1997). Justice interruptus: Critical reflections on the ‘postsocialist’ condition. New York and London: Routledge.
- Hamid, M. O. (2014). World Englishes in international proficiency tests. World Englishes, 33(2), 263–277.
- Hamid, M. O. (2016). Policies of global English tests: Testtakers’ perspectives on the IELTS retake policy. Discourse, 37(3), 472–487.
- Hamid, M. O., Hoang, N. T. H., & Kirkpatrick, A. (2019). Language tests, linguistic gatekeeping and global mobility. Current Issues in Language Planning, 20(3), 226–244.
- Hawkey, R. (2006). Impact theory and practice: Studies of the IELTS Progetto Lingue 2000. Cambridge: UCLES/Cambridge University Press.
- Hawkey, R. (2008). An impact study of a high-stakes test (IELTS): Lessons for test validation and linguistic diversity. In L. Taylor & C. J. Weir (Eds.), Multilingualism and assessment: Achieving transparency, assuring quality, sustaining diversity (pp. 215–228). Cambridge: UCLES/Cambridge University Press.
- Hoang, N. T. H., & Hamid, M. O. (2017). ‘A fair go for all?’ Australia’s language-in-migration policy. Discourse, 38(6), 836–850.
- Kirkpatrick, A. (Ed.). (2010). The Routledge handbook of world Englishes. New York: Routledge.
- Kunnan, A. (2000). Fairness and justice for all. In A. Kunnan (Ed.), Fairness and validation in language assessment (pp. 1–14). Cambridge: UCLES/Cambridge University Press.
- Kunnan, A. (2004). Test fairness. In M. Milanovic & C. J. Weir (Eds.), European year of languages conference papers, Barcelona (pp. 27–48). Cambridge: Cambridge University Press.
- Kunnan, A. (2008). Towards a model of test evaluation: Using the test fairness and test context frameworks. In L. Taylor & C. J. Weir (Eds.), Multilingualism and assessment: Achieving transparency, assuring quality, sustaining diversity (pp. 229–251). Cambridge: Cambridge University Press.
- Kunnan, A. (2014). Fairness and justice in language assessment. In A. J. Kunnan (Ed.), The companion to language assessment (pp. 1–17). New York: Wiley.
- McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Malden: Blackwell.
- Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (pp. 13–103). New York: Macmillan.
- O’Sullivan, B., & Green, A. (2011). Test taker characteristics. In L. Taylor (Ed.), Examining speaking: Research and practice in assessing second language speaking (pp. 36–64). Cambridge: UCLES/Cambridge University Press.
- Rawls, J. (2001). Justice as fairness: A restatement (E. Kelly, Ed.). Cambridge, MA: Harvard University Press.
- Sen, A. (2010). The idea of justice. London: Penguin.
- Shohamy, E. (2001). The power of tests: A critical perspective on the uses of language tests. Harlow, New York: Longman.
- Taylor, L. (2010). Setting language standard for teaching and assessment: A matter of principle, politics, or prejudice? In L. Taylor & C. J. Weir (Eds.), Language testing matters: Investigating the wider social and educational impact of assessment (pp. 139–157). Cambridge: UCLES/Cambridge University Press.
- Templer, B. (2004). High-stakes tests as high fees: Notes and queries on the international English assessment market. Journal for Critical Education Policy Studies, 2(1). Available at: http://www.jceps.com/archives/414.
- van der Heijden, J. (2013). Testing skilled migrants’ English: Ridiculous and insulting. Independent Australia. Retrieved from https://independentaustralia.net/australia/australia-display/testing-skilled-migrants-english-ridiculous-and-insulting,5989. 14 Dec 2013.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.