A mixed method study of ethical issues in classroom assessment in Chinese higher education

Abstract

The purpose of this study was to explore Chinese university professors’ decisions about ethical issues in classroom assessment. A survey with fifteen scenarios that describe professors’ thoughts about ethics in assessment practices was administered to 555 professors from 143 colleges and universities in 29 provinces in China. The results of the quantitative analysis indicated that professors’ interest in professional development related to classroom assessment and their dispositions were significantly associated with their agreement with experts in the field of classroom assessment. Professors’ gender, highest degree, professional rank, and years of teaching experience did not significantly predict their agreement. The qualitative analysis revealed that maintaining fair assessment vs being caring to students and asserting professors’ rights vs abiding by university policy were the crucial aspects for professors to consider in classroom assessment. Findings of the study could help educators identify ethical issues in assessment, develop guidelines to ensure fair assessment, and incorporate differentiated strategies in professional development workshops in higher education.

Introduction

When asked to explain whether a classroom assessment practice was ethical or unethical, a Chinese university professor described his thinking as follows:

It is an ethical practice for the professor to let students grade each other’s test papers and share the results in groups. Students should not only know how to do the questions in the test papers, they should also learn the grading process. This can help build their autonomous learning, make them reflect from other students’ mistakes, and learn to perform better by understanding rubrics through grading.

This example captures one Chinese university professor’s explanation about his decision on the ethics regarding a classroom assessment practice in which a professor has students grade each other’s papers and then share the results in groups. The scenario described was related to the issue of confidentiality in assessment practice. In this instance, the professor failed to recognize confidentiality in grading and seemed to be unaware of the students’ right to privacy (i.e., confidentiality) in assessment. This example illustrates a need for researchers to explore ethical issues in assessment practices in Chinese higher education.

Ethics in education have been studied widely in the United States (Campbell 2013; Glanzer and Ream 2007; Warnick and Silverman 2011), and many other countries including the United Kingdom (O’Leary 2008), Australia (Boon 2011), Canada (Tierney 2013), China (Fan et al. 2017), Turkey (Özbek 2013), and South Africa (Beets 2012). Ethical consideration was infused in educational assessment, in which ethics address the rules of behavior or norms that should govern educators’ assessment practices (Johnson et al. 2008). The importance of ethics in student evaluation was also emphasized by Quesenberry et al. (2012), Pope (2006), Popham (2000), Chappuis et al. (2012), and Ryan (1997). Particularly, Ryan (1997) associated student evaluation with high stakes decisions, suggesting that student assessment practice should take morality and ethics into consideration.

Regarding ethical issues in classroom assessment, some researchers focused on the issues of confidentiality (Tirrir 1999) and standardized testing (Haladyna et al. 1991; Mehrens and Kaminski 1989; Popham 1991). Other researchers examined the perceptions of educators about ethical issues in assessment practices. In a survey study, Green et al. (2007) found that respondents had strong agreement on fewer than half of the scenarios on ethical issues. Johnson et al. (2008) reported educators’ strong agreement on half of the ethical scenarios. Similarly, researchers (Fan et al. 2019; Liu et al. 2016) compared the views of pre-service teachers in China and the U.S. and indicated some differences across the scenarios. A study (Fan et al. 2017) described in depth Chinese professors’ perceptions of ethical issues in classroom assessment. The authors stated that ethical issues in classroom assessment were universal, and professors seemed to have low agreement with experts in communications about grading, confidentiality, and grading practice.

The aforementioned studies suggest a need for continued in-depth investigation in a variety of sociocultural contexts. Stakeholders (e.g., teachers, students, administrators, parents) may have different expectations and views of ethical issues in classroom assessment. Previous studies explored the views of pre-service teachers (e.g., Fan et al. 2019; Liu et al. 2016), in-service teachers (Green et al. 2007), and administrators including principals and principal candidates (Johnson et al. 2008). Classroom assessment in higher education is important, but little research has been conducted to explore the ethical issues in assessment practices. For instance, the common practice of professors sharing students’ grades is unethical. In addition, little research has examined the factors contributing to professors’ ethical reasoning in the context of higher education. The current study focused on ethical reasoning and factors that might contribute to professors’ decisions about the ethics of assessment practices. The present study is significant in that it utilized both quantitative and qualitative methods (i.e., a mixed-methods study) to collect sufficient evidence to explore ethical issues, which would be useful in helping build guidelines in educational assessment and inform professional development (PD). Confucius, a well-known educator and philosopher in ancient China, stated that “Moral force never dwells in solitude; it will always bring neighbors” (Waley, Trans. Analects of Confucius 1989, p. 106). We interpret this quote as indicating that if we do as we ought to in our daily lives, such as appropriately assessing our students, then our colleagues will follow our lead in beginning to examine their classroom assessment practices.

To better understand factors that impact professors’ perceptions about ethical issues in classroom assessment practices, we included factors such as professors’ gender, highest degree, years of teaching experience, and their professional ranks. We were also interested in exploring the impact of professors’ professional dispositions (e.g., professional attitudes, values, and beliefs) and their interest in professional development related to assessment on their perceptions about ethical issues in assessment practices. We included gender in our analysis because traditional Chinese culture placed males in a dominant and superior role and females in a subservient role. For example, women were treated as indoor housekeepers who were expected to be obedient to father, husband, and son. In modern Chinese society, gender inequity persists despite the establishment of women’s rights. Males are more likely than females to have dominant roles in decision making, management, and leadership. Studies (Liu 2007; Liu and Li 2010) reported clear gender differences in terms of educational level, regions, and administrative area, with males having a better chance of entering higher education, especially in pursuing graduate studies. In this study, we intended to explore whether male professors and female professors have different perceptions of ethical issues in classroom assessment in higher education.

We included professors’ highest degree, years of teaching experience, and professional ranks because researchers (e.g., Coburn 2005; Spillane 1999) found that an educator’s perception of a new policy is influenced by his or her pre-existing knowledge and worldview. We intended to investigate the impact of professors’ educational experiences (e.g., academic degree, teaching experience, and professional ranks) on their views of ethical issues in classroom assessment. Traditionally, universities involve professors who have higher degrees, higher professional ranks, and longer years of teaching experience in mentoring their new faculty members. We embrace the idea that this mentoring helps new faculty members in classroom instruction and planning and designing instruction. However, little research has been conducted to confirm whether this type of mentoring can be employed in terms of ethical issues in assessment. We would like to examine the impact of professors’ degrees, professional ranks, and years of teaching experience on their perceptions of ethical issues in classroom assessment. This is especially important in making choices in professional development and new faculty mentoring regarding ethical issues in assessment.

In addition, professors’ dispositions and their interest in professional development related to assessment were also of interest to us. We assumed that professors who are interested in PD might pay more attention to ethical issues in assessment, be more likely to read and discuss ethical issues in assessment, and have a higher agreement with experts in classroom assessment. Regarding dispositions, an accreditation institute for teacher education programs in the U.S., the National Council for Accreditation of Teacher Education (NCATE), defined dispositions as “professional attitudes, values, and beliefs demonstrated through both verbal and non-verbal behaviors as educators interact with students, families, colleagues, and communities” (NCATE 2008, pp. 89–90). Some studies have reported positive relationships between teacher dispositions and effective teaching (e.g., Richardson and Onwuegbuzie 2004). The NCATE (2008) described two professional dispositions that institutions are expected to assess: fairness and the belief that all students can learn.

Based on the literature reviewed, we considered three dispositional traits in our analysis: fairness, flexibility and caring, and degree of strictness. We examined whether an association exists between professors’ dispositions and their agreement with experts in terms of ethical issues in classroom assessment. Understanding professors’ professional dispositions is informative in PD. Professors who have different dispositions might perceive ethical issues in assessment differently. It is important for PD workshop leaders to be aware of the differences and adopt differentiated methods for professors of different dispositions. For example, in PD group discussion, the workshop leaders should make sure each discussion group includes professors of different dispositions to facilitate meaningful discussion. In addition, understanding professors’ dispositions will help develop and design the PD sessions to meet the needs of all professors.

We believe assessment norms and principles should be universal and applied to a culturally diverse educational background. Educators’ views of ethical issues in classroom assessment are strongly influenced by the cultural norms and ideology and their pedagogical knowledge (Fan et al. 2019). Professors’ views of ethical issues might be impacted by Chinese intellectual traditions (e.g., Confucianism) and the test-oriented instruction. We believe what constitutes ethical behavior in educational assessment involves some range of agreement among educators. For some classroom assessment practices, there is unanimity of decisions about the ethical implications. However, for other assessment practices, educators may not agree on the ethical implications. The goal of this study was to move towards consensus and consistency in behavior based on agreement and understanding. We anticipate the findings could inform decision making in assessment practices and ensure fair assessment through trainings and reflections on ethics.

In the current study, we assumed that Chinese university professors hold various views on the 15 scenarios aligned with the six categories in classroom assessment. They have consistent views with experts on some scenarios and split views with experts on other scenarios. We assumed that Chinese university professors’ gender, highest degree, professional rank, years of teaching experience, interest in professional development, and dispositions impact their decisions about assessment practice. Professors who have higher academic degrees, higher professional ranks, and more years of experience in education should have more consistent views with experts and make appropriate decisions about ethical issues in classroom assessment. We also assumed that professors might face some dilemmas in making decisions about ethical issues in classroom assessment.

Theoretical framework and research questions

The current study drew on both theoretical and empirical foundations in professional ethics and educational assessment. Ethics can be traced back to the teachings of Socrates (Plato 2009) who described ethics as “how we ought to live” (p. 352d). Similarly, in more recent times, Strike and Soltis (1992) wrote, “When someone behaves in a way that is different from how people ought to behave, an ethical standard is violated” (p. 6). Ethics are generally referred to as the rules of behavior or practices that a profession imposes on itself (Sax 1974; Thorndike et al. 1991), and professional ethics are related to the principles of conduct that people use to guide their behaviors and actions in their professional field (Brandt and Rose 2004). In education, professional ethics describe the “norms, values, and principles that should govern the conduct of educational professionals” (Husu 2001, p. 68).

This study was also grounded in the theory of educational assessment. Three major sources of guidelines in assessment were employed in making decisions about whether some assessment practices are ethical or unethical practices. The Joint Committee on Standards for Educational Evaluation (JCSEE 2003) published the Student Evaluation Standards used for evaluating students, where it states that assessment practices should be “ethical, fair, useful, feasible, and accurate” (p. 3). Another guideline was proposed by Taylor and Nolen (2005), who recommended that educators should “Do No Harm” when assessing students (p. 7). Then, Green et al. (2007) suggested “Avoid score pollution” be applied as an ethical principle in assessment practices. Score pollution happens when grades or test scores do not accurately reflect students’ mastery of content. Although a polluted score might do harm to the students, we elect not to collapse the two principles into Do No Harm because recognizing score pollution as an ethical issue provides educators with greater discernment between the ethicality of behaviors and actions.

Ethical issues in classroom assessment have been discussed broadly. Students’ grades should reflect their learning, and students should be informed of the activities to be considered in their final grades (Ory and Ryan 1993). Students’ rights of privacy should be protected (Brookhart and Nitko 2008), and test results should not be revealed to anyone who does not have a legitimate need to know the scores (JCSEE 2003; Worthen et al. 1998). Teachers should minimize the effect on scoring of factors (e.g., student ability, effort, attendance, and attitude) that are irrelevant to the purposes of the assessment (Brookhart and Nitko 2008; Oosterhof 2009). Multiple assessment methods should be used, and no single task or test can accurately or adequately measure an important learning outcome (Gronlund 2003; Wiggins 1994). Test administration should be fair to all test takers (Brookhart and Nitko 2008), and teachers should avoid intervention in test administration (Sax 1974).

The current study was intended to investigate Chinese university professors’ perceptions of classroom assessment practices. The instrument of this study was a survey with fifteen scenarios that were intended to examine professors’ decisions about ethics in assessment practices, and their justifications of their decisions. We employed a mixed method, and both quantitative and qualitative analyses were conducted. We assumed that professors from various personal and educational backgrounds might report different views of the assessment practices. We anticipate that the findings of this study could help educators identify ethical issues in assessment and develop guidelines to ensure fair assessment. The current study was intended to address the following questions:

  1. What are the differences of professors’ perceptions of ethical issues in classroom assessment practices based on their gender, highest degree, professional rank, years of teaching experience, interest in PD, and their professional disposition?

  2. What are the factors that impact professors’ agreement scores with experts on ethical issues in assessment practices?

  3. What is the reasoning for professors’ decisions on ethical issues in assessment practices?

Method

Participants

Participants included 555 professors from 143 colleges and universities located in 29 provinces in China. About 30% of the professors taught in the field of STEM (science, technology, engineering, mathematics), 63% taught in the field of humanities and social sciences, and 7% taught in other areas (e.g., agriculture, mining). Approximately one-third (33.9%) of the participants were males and two-thirds (66.1%) were females. About 42% were either professors or associate professors, and 58% were assistant professors or lecturers. About 36% of the professors had a doctoral degree, 59% had a master’s degree, and 5% had a bachelor’s degree. Approximately 60% of participants had more than 10 years of teaching experience. In addition to demographic information, the participants were also asked to define their dispositions in teaching by selecting one of the three descriptions, and whether they were interested in participating in professional development related to classroom assessment. About half (50.5%) of the participants self-defined themselves as being fair and treated all students equally, one-third (33.5%) defined themselves as being caring and flexible with rules, and 16.0% stated they were strict disciplinarians. Furthermore, approximately two-thirds (64.4%) of the respondents expressed that they were interested in participating in professional development related to classroom assessment.

Instrument

For the current study, we developed a survey consisting of fifteen scenarios about ethical issues in classroom assessment. The fifteen scenarios aligned with six categories (Bias/Fairness, Communication about Grading, Confidentiality, Grading Practices, Multiple Assessment Opportunities, and Test Administration) that describe professors’ thoughts about ethics in classroom assessment practices (see “Appendix”). The fifteen scenarios were originally developed by Green et al. (2007), Johnson et al. (2008), and Liu et al. (2016). The original scenarios were in English and mainly used for K-12 education, and the participants consisted of pre-service teachers, in-service teachers, and educational leaders in elementary, middle, and high schools. The current study was targeted at university professors. We revised the scenarios, particularly the use of language, to make them more applicable to classroom assessment in Chinese higher education. For example, we changed “an elementary school teacher” used in previous surveys to “a professor” in the current survey. Another revision was about the categories in the instrument. The previous surveys included seven categories; however, we excluded the category Standardized Test Preparation in the current survey considering the standardized test preparation was not a common assessment practice in Chinese higher education.

We included different numbers of scenarios in different categories due to the frequency of ethical issues in assessment (see Table 1). For example, grading practice has the largest number of scenarios because of a high frequency of ethical issues in this category. Our decision was consistent with those in previous studies where grading practice included the largest number of scenarios (Green et al. 2007; Johnson et al. 2008; Liu et al. 2016). We translated the scenarios into Chinese. To support the content validity of the instrument, we invited six university professors who had expertise in English–Chinese language translation, higher education, and classroom assessment to review the instrument. Their reviews focused on the accuracy of translation from English to Chinese, language bias, contextual meaning of each scenario, and relevance of each scenario to Chinese higher education.

Table 1 Expert view and ethicality of scenarios

To guide us in making decisions about assessment practices (ethical/unethical), we adopted the method of expert view. We reviewed multiple textbooks, journal articles, and teaching standards related to classroom assessment (Table 1). We made decisions about the ethicality of the scenarios based on the reviewed literature, which was considered as “expert view.” For example, according to JCSEE (2003), Ory and Ryan (1993), and Oosterhof (2009), student evaluation should be ethical, fair, useful, and feasible, and unrelated factors (e.g., efforts, growth, attendance) should not be incorporated into grading. Therefore, using the “expert view” from the literature, we decided that Scenario 10 (A professor who knows a student had a bad week because of problems at home bumps the student's participation grade up a few points to compensate for his bad score on a quiz.) was an unethical assessment practice. In the study, we asked participants to select “ethical” or “unethical” on each scenario. When participants had views consistent with experts, they obtained one point. Otherwise, they obtained zero points. Table 1 describes expert view by reviewing selected assessment literature within each category. It also indicates the ethicality of each scenario based on the guidelines from the literature reviewed.

Data collection

To recruit study participants, we contacted colleagues and former classmates who work as university professors. In turn, they made connections with their colleagues and teaching faculty and requested that they participate in the survey. We sent each professor a SurveyMonkey link by email. The email message explained the purpose of the survey and provided a clickable button to begin the on-line survey. Respondents were required to select whether the statement described in each scenario was ethical/unethical and then to explain their reasoning in a text box. Respondents were also asked to provide their demographic information including gender, highest degree, professional rank, years of teaching experience, interest in professional development, and their dispositions.

Data analysis

We employed a mixed method of both quantitative and qualitative analyses to investigate the factors contributing to professors’ perceptions about ethical issues in assessment practices. SPSS was used to analyze the quantitative data. Descriptive statistics were computed, and professors’ agreement scores were reported in each category to present an overview of respondents’ agreement with experts in each category. Each respondent’s agreement score was calculated based on his or her agreement with experts. When a professor agreed with experts on one scenario, he/she obtained one point. For example, a score of 11 means that the respondent agreed with experts on the ethics of the practices depicted in 11 scenarios. The agreement scores ranged from 0 to 15 points, with higher points indicating higher agreement with experts.
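The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' SPSS procedure: the function name `agreement_score` and the three-scenario key in the example are hypothetical and serve only to show the arithmetic (the actual expert judgments are those summarized in Table 1).

```python
def agreement_score(responses, expert_key):
    """Count the scenarios on which a respondent matches the expert view.

    Both arguments map a scenario number to "ethical" or "unethical".
    Each match earns one point and each mismatch (or missing answer)
    earns zero, so with fifteen scenarios the score ranges from 0 to 15.
    """
    return sum(
        1
        for scenario, expert_view in expert_key.items()
        if responses.get(scenario) == expert_view
    )


# Hypothetical three-scenario key and responses, for illustration only.
example_key = {1: "unethical", 2: "ethical", 3: "unethical"}
example_responses = {1: "unethical", 2: "unethical", 3: "unethical"}

# The respondent matches the key on scenarios 1 and 3 but not on scenario 2.
example_score = agreement_score(example_responses, example_key)
```

With the full fifteen-scenario key, a respondent who matched the expert view on 11 scenarios would receive a score of 11 under this rule.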

We reported the percentage of professors who agreed with experts on each scenario within each category to illustrate their overall agreement (see “Appendix”). To investigate the factors that might impact professors’ agreement scores, we described professors’ agreement scores based on their gender, highest degree, professional rank, years of teaching experience, their interest in professional development, and their dispositions.

To investigate the factors that might impact the agreement scores that the respondents have with experts, we conducted a multiple regression analysis. Professors’ gender, highest degree, professional rank, years of teaching experience, their interest in professional development, and their dispositions were independent variables, and respondents’ agreement score was the dependent variable. Before conducting the multiple regression analysis, we examined the assumptions, including identifying influential cases using Cook’s D, testing normality of the residuals using P–P plots, and examining the independence of residuals using the Durbin–Watson test. In the multiple regression analysis, we dummy coded the independent variables because they were categorical. For gender, we coded female as 0 and male as 1. For the highest degree, we used Ph.D. as the reference category and created separate indicators for the bachelor’s degree and the master’s degree, each coded as 1. For professional rank, we coded assistant professor or lecturer as 0, and associate professor or full professor as 1. For years of experience, we coded 10 years or less as 0, and more than 10 years as 1. For professors’ interest in PD, having interest was coded as 1, and having no interest was coded as 0. For professors’ disposition, we used being fair as the reference category and created separate indicators for being easy and being strict, each coded as 1.
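The dummy coding scheme can be made concrete with a short Python sketch. It is an illustration under assumptions rather than the authors' SPSS setup: the function name `dummy_code`, the dictionary keys, and the category labels are hypothetical names chosen for readability.

```python
def dummy_code(professor):
    """Turn one professor's categorical answers into 0/1 indicators.

    Reference categories (all related indicators equal zero): female,
    Ph.D., assistant professor or lecturer, 10 years or less of
    experience, no interest in PD, and the "fair" disposition.
    """
    senior_ranks = ("associate professor", "full professor")
    return {
        "male": int(professor["gender"] == "male"),
        "bachelor": int(professor["degree"] == "bachelor"),
        "master": int(professor["degree"] == "master"),
        "senior_rank": int(professor["rank"] in senior_ranks),
        "over_10_years": int(professor["years"] > 10),
        "pd_interest": int(professor["pd_interest"]),
        "easy": int(professor["disposition"] == "easy"),
        "strict": int(professor["disposition"] == "strict"),
    }


# Hypothetical respondent, for illustration only.
example_professor = {
    "gender": "male",
    "degree": "master",
    "rank": "lecturer",
    "years": 12,
    "pd_interest": True,
    "disposition": "easy",
}
example_coded = dummy_code(example_professor)
```

Because highest degree and disposition each have three levels, each contributes two indicators, so the six factors yield eight indicator variables in total.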

In addition to the quantitative analysis, we used the in vivo coding method to conduct the qualitative analysis. We coded respondents’ written explanations for each scenario using in vivo coding. In vivo coding focuses on the actual words used by the participants, and we selected this method to highlight the voices of the participants (Miles et al. 2014). Our participants were university professors in China. We believe their perceptions of ethical issues in classroom assessment are shaped by their social, educational, and cultural experience. In vivo coding allows us to capture participants’ views from a culturally situated context. For example, one professor selected “unethical” in response to Scenario 8 about considering student effort in grading, and he explained, “Professors should treat all test takers in a fair way, and effort should not be considered in grading.” We used the respondent’s word “fair” as a code. We reviewed the codes for patterns, and finally constructed themes based on the patterns of the codes across categories of the scenarios. To ensure the reliability of coding, two authors of this study, who were trained in coding qualitative data, coded separately, and then discussed and reached an agreement when their opinions differed.

Results

In the following section, we use both quantitative and qualitative methods to analyze professors’ responses. We first describe the percentage of professors who had consistent views with experts on the scenarios in each category (see “Appendix”). Then we present respondents’ mean agreement scores in each category (e.g., Grading Practices, Multiple Assessment Opportunities). To explore the factors that might impact professors’ agreement with experts, we describe professors’ agreement scores based on their gender, highest degree, professional rank, years of teaching experience, interest in professional development, and dispositions. Finally, we use multiple regression analysis to examine the factors that might predict professors’ agreement scores. Furthermore, we also utilize qualitative methods to analyze respondents’ explanations and justifications for their decisions on ethicality. The number of survey participants who provided explanations and justifications to each scenario ranged from 198 to 288.

Quantitative analysis

We calculated the percentages of professors who agreed with experts on the ethicality of each scenario (“Appendix”) to understand their overall agreement with experts. It appears that professors had views consistent with experts on the scenarios in Multiple Assessment Opportunities. However, they had split views (consistent views on some scenarios and inconsistent views on other scenarios) with experts in Bias/Fairness, Communication about Grading, Confidentiality, Grading Practice, and Test Administration.

We calculated the means of professors’ agreement scores (Table 2). The overall agreement scores range from 0 to 15 points, with higher points indicating higher agreement with experts. The mean is 8.34, indicating that professors agreed with experts on more than half of the scenarios on average. We also calculated the mean for each scenario. The agreement scores for each scenario range from 0 to 1 point, and a higher score indicates that professors have higher agreement with experts. Professors seemed to have high agreement with experts on Scenarios 3, 4, 11, and 13, and low agreement with experts on Scenarios 2, 5, and 15.

Table 2 Descriptive Statistics of the Scenarios

We calculated the mean agreement scores in each category (Table 3). Since there are different numbers of scenarios in each category, the maximum possible score differs across categories. For example, the average agreement score is 1.1 for the category of Bias/Fairness, in which there are 2 scenarios and the maximum possible score is 2 points. The results show that professors had the most consistent views with experts in Multiple Assessment Opportunities, and the least consistent views with experts in Communication about Grading and Confidentiality.
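Because the categories differ in their maximum possible scores, one way to compare them is to divide each category mean by its number of scenarios. The Python sketch below illustrates this with the one example given in the text; the function name `proportion_of_maximum` is hypothetical, and this normalization is offered as a reading aid rather than as a step the authors report.

```python
def proportion_of_maximum(mean_score, n_scenarios):
    """Express a category's mean agreement score as a share of its maximum.

    Each scenario is worth one point, so the maximum possible score in
    a category equals the number of scenarios it contains.
    """
    if n_scenarios <= 0:
        raise ValueError("a category must contain at least one scenario")
    return mean_score / n_scenarios


# Example from the text: a mean of 1.1 out of 2 possible points.
bias_fairness_share = proportion_of_maximum(1.1, 2)
```

Under this reading, the mean of 1.1 corresponds to agreement with experts on 55% of the available points in that category.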

Table 3 Mean agreement scores within each category

To explore whether professors’ gender, highest degree, professional rank, years of teaching experience, interest in professional development, and dispositions have an impact on their agreement with experts, we described their agreement scores based on these factors. Table 4 describes the means of professors’ agreement scores. It appears that male professors obtained a slightly higher agreement score (\(\bar{x}\) = 8.38) than female professors (\(\bar{x}\) = 8.30). The professors who had a Ph.D. degree had slightly higher agreement scores (\(\bar{x}\) = 8.36) than those with a Master’s degree (\(\bar{x}\) = 8.32) or a Bachelor’s degree (\(\bar{x}\) = 8.16). Lecturers achieved the highest agreement score (\(\bar{x}\) = 8.42), while professors obtained the lowest agreement score (\(\bar{x}\) = 8.15). Professors who had five or fewer years of teaching experience had the highest agreement score (\(\bar{x}\) = 8.53), while those who had between 16 and 20 years of teaching experience had the lowest agreement score (\(\bar{x}\) = 7.99). Professors who expressed interest in participating in PD related to assessment had higher agreement scores (\(\bar{x}\) = 8.58) than those who were not interested in PD (\(\bar{x}\) = 7.86). The professors who self-defined themselves as “fair” obtained the highest agreement score (\(\bar{x}\) = 8.59), while those who self-defined themselves as “easy” obtained the lowest agreement score (\(\bar{x}\) = 7.92).

Table 4 Mean agreement scores based on independent variables

It appears that professors’ interest in PD and disposition made a difference in professors’ agreement scores. According to Table 5, the professors who were interested in PD and self-defined as “fair” obtained the highest agreement score (\(\bar{x}\) = 8.98), while those who were not interested in PD and self-defined themselves as “easy” obtained the lowest agreement score (\(\bar{x}\) = 7.81).

Table 5 Mean agreement scores based on interest and disposition

To explore the impact of the factors on professors’ agreement score, we conducted multiple regression analysis. We examined the assumptions required for conducting multiple regression analysis. There were no influential cases, and Cook’s D values were all under 1. The residuals were approximately normally distributed based on the P–P plots for the model. The Durbin–Watson statistic (1.93) was close to 2, indicating that the residuals were independent. Results from the multiple regression analysis (Table 6) indicate a statistically significant association between professors’ agreement score and the independent variables, F(8, 525) = 3.20, p < 0.01. Professors’ professional disposition seemed to be a significant predictor, and their agreement scores were significantly different between the professors who self-identified as being easy and those being fair (B = −0.67, p < 0.01). The agreement scores were significantly higher for the professors who had interest in PD than those who had no interest in PD (B = 0.70, p < 0.01). Gender, highest degree, professional rank, and years of teaching experience were not significantly associated with professors’ agreement scores.
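For readers unfamiliar with the Durbin–Watson check reported above, the statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The Python sketch below computes it for a list of residuals; the function name `durbin_watson` and the toy residuals are illustrative assumptions and are not taken from the study's SPSS output.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals.

    Computed as the sum of squared successive differences divided by
    the sum of squared residuals. Values near 2 suggest no first-order
    autocorrelation; values toward 0 or 4 suggest positive or negative
    autocorrelation, respectively.
    """
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (current - previous) ** 2
        for previous, current in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    return numerator / denominator


# Toy residuals that alternate in sign (strong negative autocorrelation).
alternating = durbin_watson([1, -1, 1, -1])

# Toy residuals that never change (strong positive autocorrelation).
constant = durbin_watson([1, 1, 1, 1])
```

The two toy cases sit at opposite ends of the scale, which is why a value close to 2 is read as evidence that the residuals are independent.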

Table 6 Results of multiple regression analysis
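Two of the quantities reported for the regression can be illustrated with short formulas. The sketch below is not the authors’ analysis code. It shows how a Durbin–Watson statistic is computed from a residual series, and how an overall F statistic relates to R², assuming the reported F is the overall model test with 8 numerator and 525 denominator degrees of freedom. The function names are hypothetical, and the implied R² is back-calculated here rather than reported in the study.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares.

    Values near 2 suggest uncorrelated residuals; values near 0 or 4
    suggest positive or negative serial correlation, respectively.
    """
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def f_statistic(r_squared, k, df_resid):
    """Overall F for a model with k predictors and df_resid residual df."""
    return (r_squared / k) / ((1.0 - r_squared) / df_resid)


def r_squared_from_f(f_value, k, df_resid):
    """Invert f_statistic: the R-squared implied by an overall F test."""
    return (f_value * k) / (f_value * k + df_resid)


# Back-of-envelope check under the stated assumption about the df.
implied_r2 = r_squared_from_f(3.20, k=8, df_resid=525)
print(round(implied_r2, 4))
```

Under these assumptions the model would account for a little under 5% of the variance in agreement scores, which is worth keeping in mind when interpreting the significant predictors.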

Qualitative analysis

Being fair vs being caring

In reviewing the respondents’ explanations about each scenario, we identified the first theme as being fair vs being caring. About half of the respondents thought addressing only students’ strengths in feedback (Scenario 1) was unethical, and two-thirds considered it unethical to bump a student’s participation grade up because of his problems at home (Scenario 10). These respondents believed feedback and assessment should be “fair” and “objective.” For example, one professor considered it unethical to address only students’ strengths in feedback and stated, “It is unfair to address only students’ strengths, and addressing students’ weaknesses is more beneficial for their improvement.” The professors who held different opinions considered these assessment practices ethical because they thought positive feedback “does not hurt students” and reflects professors’ “kind heart” and “care for” students. For Scenario 10, one professor judged increasing a student’s participation grade because of his problems at home to be ethical, explaining, “It suggests educator’s care for students, and this action should be approved.”

Similarly, more than half of the respondents thought that it was unethical to consider student effort in grading (Scenario 8). They also considered it unethical to add points to a difficult mid-term test to ensure students could still pass the course (Scenario 12). Respondents who judged these practices unethical believed that they went against the rule of “fair assessment.” One professor viewed considering student effort as an unethical practice and wrote, “This practice is lack of fairness in grading, and it reflects teacher’s disrespect of grading criteria.” Other professors who considered these practices ethical believed that students should be “cared for” and deserve “better grades if they work harder.” For example, one professor judged it ethical to take effort into account in grading, explaining, “Considering student effort is human nature, this caring nature is good for students’ learning attitude.”

In addition, in test administration, some professors considered it unethical to indicate to a student that she recorded her answers out of sequence (Scenario 6) because they believed this practice was “not fair to other students.” The professors who considered this practice ethical believed professors should have a “caring heart” and a “strong responsibility” to remind students to attend to the sequence of their answers. One professor considered the practice in Scenario 6 ethical and explained, “The teacher did not tell the student the answer to the question, it is a kind and caring reminder to a careless student.” Finally, some professors considered using only multiple-choice questions in a test (Scenario 14) ethical because they believed it is “fair to all students.” Conversely, some believed that using multiple ways of assessment (Scenario 3) was unethical because it was too “subjective” and might not “fairly” evaluate student learning.

Professors’ rights vs university policy

Another theme arising from the respondents’ explanations was the tension between university policy and professors’ rights. For example, in the scenario about adding points to students’ grades (Scenario 12), one professor wrote, “It is very common in a Chinese university testing environment. Some universities have rules about ensuring a certain percentage of passing or excellence rate.” Such rules posed a dilemma for professors, who had to choose between adding points to students’ grades and violating university rules. Similarly, some professors considered counting student attendance as 20% of the final score (Scenario 15) ethical because they believed attendance is a “university disciplinary rule” to which all students should adhere. For example, one professor judged this practice ethical and explained, “This is a commonly adopted rule for many universities in China. Classroom instruction is a very important part, and being absent from class might lead to students’ failure in final exam or cheating.” University policy was thus an important factor influencing professors’ decisions on the ethicality of assessment practices.

Furthermore, professors’ rights appeared to be a key element that might shape their perceptions of some ethical issues in classroom assessment. For example, some professors stated that it is ethical to change a student’s score from B+ to A as long as evidence can be collected to indicate the student’s mastery of the course objectives (Scenario 4). They supported this view by explaining that professors have the “rights in instruction and assessment” and should be “flexible” in grading. Similarly, some professors considered it ethical to have some surprise items on the test (Scenario 5) because they believed professors should have the “rights” and “flexibility” to make decisions on test construction and preparation. For instance, one professor wrote, “Teachers have the rights to make decisions on the content to be included in a test, some surprising items might help differentiate students’ test performance.” In addition, some professors considered not providing rubrics (Scenario 7) ethical, arguing that “rubrics should be decided by professors” and that students should not be involved in developing the rubrics (i.e., scoring guides).

Discussion

We found that professors’ views were consistent with those of experts in Multiple Assessment Opportunities. However, their views were split (consistent with experts on some scenarios and inconsistent on others) in Bias/Fairness, Communication about Grading, Confidentiality, Grading Practice, and Test Administration. This suggests that professors’ perceptions of ethical issues varied even within the same category. For example, professors seemed to be aware of the confidentiality issue in assessment and considered sharing students’ grades in class unethical (Scenario 9). However, they seemed to neglect the confidentiality of students’ grades by agreeing that it is ethical to involve students in grading (Scenario 2). Professors did not necessarily have stronger awareness of, or higher agreement with experts on, ethical issues within a specific category (e.g., confidentiality). Therefore, in professional development for faculty members in higher education, ethical issues should be discussed scenario by scenario rather than category by category.
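The category-level pattern described above can be reproduced from the per-scenario percentages in the appendix. The following is a minimal sketch that labels a category “consistent” when more than half of respondents agreed with experts on every scenario in it, and “split” when that holds for only some scenarios. The 50% cutoff and the function name are assumptions made for illustration; the study does not state the rule it used.

```python
# Percent of respondents agreeing with experts, keyed by scenario number,
# copied from the appendix table.
AGREEMENT = {
    "Fairness/bias": {1: 48.0, 10: 63.3},
    "Communication about grading": {5: 9.4, 7: 31.4, 11: 92.5},
    "Confidentiality": {2: 19.9, 9: 73.2},
    "Grading practice": {4: 80.8, 8: 65.0, 12: 63.9, 15: 17.8},
    "Multiple assessment opportunities": {3: 87.2, 14: 71.7},
    "Test administration": {6: 34.0, 13: 80.2},
}


def classify(percentages, threshold=50.0):
    """Label a category by whether a majority agreed on each scenario.

    The threshold is an assumed cutoff, not one taken from the study.
    """
    majority = [p > threshold for p in percentages]
    if all(majority):
        return "consistent"
    if not any(majority):
        return "inconsistent"
    return "split"


labels = {
    category: classify(scenarios.values())
    for category, scenarios in AGREEMENT.items()
}
for category, label in labels.items():
    print(f"{category}: {label}")
```

With this cutoff, only Multiple Assessment Opportunities comes out as consistent and the other five categories come out as split, which matches the description in the text.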

Professors with a Ph.D. degree had the highest agreement score, followed by those with a master’s degree and those with a bachelor’s degree. This suggests that a higher level of education might be associated with views more consistent with those of experts. We also found that male professors obtained a slightly higher agreement score than female professors. This result may be explained by the finding that males have a better chance of entering higher education in China, especially in pursuing graduate studies (Liu 2007). In the current study, about 61% of male participants had a doctoral degree, while only 23% of female participants did. It is important to note that the differences in perception between male and female professors in this study were small.

Lecturers and assistant professors achieved higher agreement scores than professors and associate professors. This finding contradicted our initial assumption that professors who have a higher professional rank might have more awareness of fair assessment. Our study appeared to suggest that professors with lower professional ranks tend to have more consistent views with experts. This might be due to the educational reform and development in China. New faculty members might have more exposure to the educational assessment experience in western countries where student-centered instruction and fair assessment are broadly advocated. The results echoed the finding that an educator’s perception of a new policy is influenced by his or her pre-existing knowledge and worldview (Coburn 2005; Spillane 1999).

Professors’ interest in PD regarding assessment and their disposition significantly predicted their views of ethical issues in classroom assessment. Professors who reported interest in participating in PD related to assessment had higher agreement scores than those who were not interested in PD. Professors who described themselves as “fair” obtained the highest agreement score, while those who described themselves as “easy” obtained the lowest. Professors with greater interest in professional development might be more aware of the ethical issues in assessment practice. Universities should provide professors with more opportunities to read about and discuss ethical issues in assessment.

The results from the qualitative analysis were consistent with previous studies. For example, Tierney (2013) examined fairness in classroom assessment and discussed its components in detail, including students’ learning transparency, classroom environment, critical reflection, and equal and equitable treatment (p. 55). Beets (2012) argued that the principles of caring should be infused in assessment. Similarly, Gilligan (1982) reported that, in resolving ethical dilemmas, some individuals considered ethics from a justice perspective, whereas others considered ethics from a caring perspective. We recommend that educational assessment guidelines take caring into consideration when addressing fairness in assessment. The ideal solution is to maintain fair assessment while caring for the students.

It is useful to help educators shape their belief system about ethical issues in assessment. Caring and flexibility in assessment practice are important characteristics for professors as educators, but it is also important to maintain fair assessment. Professors should first ensure fairness in assessment. This does not mean they do not care about students; rather, fair assessment provides students with unbiased information about their learning. Caring for students can be reflected in better instruction and in communication with students outside class. Students should be treated fairly, and both their weaknesses and strengths should be addressed in a caring and positive way in classroom instruction and assessment. Assessment should be treated seriously, accurately, and, importantly, ethically.

Conclusions

This study used a mixed method of both quantitative and qualitative analysis to investigate Chinese university professors’ perceptions of ethical issues in classroom assessment, and the factors that impact their perceptions. The findings of this study can contribute to the literature of ethical issues in classroom assessment, be used to help develop culturally relevant ethical guidelines in classroom assessment, and provide implications for professional development for faculty members in higher education. We believe the factors impacting professors’ perceptions of ethical issues can help shape their overall views in classroom assessment, instruction, and higher education, which will further influence the fairness of assessment and the effectiveness of instruction.

The findings can contribute to the literature of ethical issues in classroom assessment. First, our study intended to explore professors’ views about ethical issues, which are guided by the magnitude of the knowledge they have regarding classroom assessment and professional ethics. This study revealed that professors’ knowledge of professional ethics in classroom assessment is somewhat limited. Previous relevant studies by Green et al. (2007), Johnson et al. (2008), Liu et al. (2016), and Fan et al. (2017) described educators’ perceptions of ethical issues in classroom assessment without considering the factors that might impact their decision making about ethicality in assessment. The current study was exploratory, and considered factors in the investigation of ethical issues in classroom assessment. Using a qualitative method to analyze professors’ justifications of their views and the rationality of their beliefs could help build knowledge of ethical issues in classroom assessment in higher education.

The findings of this study can be used to help develop culturally relevant ethical guidelines in classroom assessment. Through the qualitative analysis, we identified a tension between university policy and professors’ rights. This echoes the findings of Pope et al. (2009), who identified ethical conflicts between institutional requirements and student need as the most prevalent category and ethical conflicts between institutional requirements and teacher need as the second most prevalent category. It appears that developing ethical guidelines should involve university administrators, policy makers, professors, and students. Institutional requirements, teacher need, and student need should be balanced to ensure fair and ethical assessment. Universities should take both student need and teacher need into consideration when making policies to ensure ethical assessment. Professors should be aware of their roles in instruction and assessment and of the roles they play in their institutions.

Compared with the findings of studies conducted in the United States (Green et al. 2007; Johnson et al. 2008; Liu et al. 2016), educators in China seem to have less awareness of ethical issues in classroom assessment regarding confidentiality and communication about grading. This suggests that these areas should be emphasized in faculty mentoring and professional development in Chinese higher education. Professional development is needed for professors to ensure fair and ethical assessment practices in their teaching. Traditionally, professors with higher degrees, higher professional ranks, and more years of teaching experience mentor those with lower degrees, lower professional ranks, and fewer years of teaching experience. However, this might not be the ideal arrangement for PD on ethical issues in assessment, because degree, rank, and experience did not significantly predict agreement with experts in this study. Instead, mentoring should help professors recognize the importance of, and the need for, in-depth discussion of ethical issues in professional development. Universities should therefore give professors more opportunities to read about and discuss ethical issues in assessment.

This study has some limitations. It included fifteen scenarios in six categories, with only two to four scenarios in each category. More classroom assessment scenarios should be developed, since including a sufficient number of scenarios in each category would be beneficial for examining the reliability of the scale. While reading through respondents’ explanations of their decisions on several scenarios, we noted that quite a few respondents indicated that it was hard to judge a scenario as ethical or unethical. They emphasized that the judgment depended on the context of the assessment. For example, in test administration, it would be unethical to remind a specific student to check his or her answer to a question in a large-scale, high-stakes summative assessment. However, if the assessment were formative, small in scale, ungraded, and used to inform the professor and students about their learning, it would be ethical to remind students to check their answers. Because the context of assessment is complicated, multiple methods should be used to collect evidence to help interpret the ethical issues in assessment practice. Although we allowed respondents to explain their judgments of ethicality, this was still not enough to gain a comprehensive understanding of the ethical issues in different assessment contexts. We recommend that future researchers employ observations, interviews, and focus group discussions to explore the ethical issues in assessment practice.

Another limitation of the current study concerns the factors that might influence professors’ perceptions of ethical issues in assessment practices. We investigated only professor-level factors: gender, highest degree, professional rank, years of teaching experience, interest in professional development, and disposition. However, some university-level factors, such as university climate, geographic location, and university rank, might also influence professors’ perceptions of ethical issues in assessment. Future studies should take these factors into consideration. A combination of professor-level and university-level factors might help better interpret professors’ perceptions of ethical issues in classroom assessment practices. In addition, we explored the ethical issues in classroom assessment only through the views of professors. The views of other stakeholders, including students, administrators, and parents, should be investigated in future research. We believe that studying ethical issues from the views of various stakeholders could help raise awareness of ethical assessment and inform classroom instruction, assessment, and professional development.

References

  1. Airasian, P. (2000). Assessment in the classroom: A concise approach (2nd ed.). Boston: McGraw-Hill.

  2. Beets, P. A. D. (2012). Strengthening morality and ethics in educational assessment through Ubuntu in South Africa. Educational Philosophy and Theory, 44, 68–83.

  3. Boon, H. (2011). Raising the bar: Ethics education for quality teachers. Australian Journal of Teacher Education, 36, 76–93.

  4. Brandt, D., & Rose, C. (2004). Global networking and universal ethics. AI & Society, 18(4), 334–343.

  5. Brookhart, S. M., & Nitko, A. J. (2008). Assessment and grading in classrooms. Upper Saddle River, NJ: Pearson.

  6. Campbell, E. (2013). Cultivating moral and ethical professional practice. In M. Sanger & R. Osguthorpe (Eds.), The moral work of teaching and teacher education: Preparing and supporting practitioners (pp. 29–44). New York, NY: Teachers College Press.

  7. Chappuis, J., Stiggins, R., Chappuis, S., & Arter, J. (2012). Classroom assessment for student learning: Doing it right-using it well (2nd ed.). Boston: Pearson.

  8. Coburn, C. E. (2005). Shaping teacher sensemaking: School leaders and the enactment of reading policy. Educational Policy, 19(3), 476–509.

  9. Fan, X., Johnson, R., Liu, J., Zhang, X., Liu, X., & Zhang, T. (2019). A comparative study of pre-service teachers’ views on ethical issues in classroom assessment in China and the United States. Frontiers of Education in China, 14(2), 309–332.

  10. Fan, X., Johnson, R., & Liu, X. (2017). Chinese university professors’ perceptions about ethical issues in classroom assessment practices. New Waves Educational Research & Development, 20(2), 1–19.

  11. Gilligan, C. (1982). In a different voice: Psychological theory and women’s development. Cambridge, MA: Harvard University Press.

  12. Glanzer, P. L., & Ream, T. C. (2007). Has teacher education missed out on the “ethics boom”? A comparative study of ethics requirements and courses in professional majors of Christian colleges and universities. Christian Higher Education, 6(4), 271–288.

  13. Green, S., Johnson, R., Kim, D., & Pope, N. (2007). Ethics in classroom assessment practices: Issues and attitudes. Teaching and Teacher Education, 23(7), 999–1011.

  14. Gronlund, N. (2003). Assessment of student achievement (7th ed.). Boston: Allyn and Bacon.

  15. Plato (2009). Republic. In R. C. Solomon, C. W. Martin, & W. Vaught (Eds.), Book I. Morality and the good life: An introduction to ethics through the classical sources (5th ed.), G. M.A. Grube, trans. Boston, MA: McGraw-Hill.

  16. Haladyna, T. M., Nolen, S. B., & Haas, N. S. (1991). Raising standardized achievement test scores and the origins of test score pollution. Educational Researcher, 20, 2–7.

  17. Husu, J. (2001). Teachers at cross-purposes: A case-report approach to the study of ethical dilemmas in teaching. Journal of Curriculum and Supervision, 17(1), 67–89.

  18. Johnson, R., Green, S., Kim, D., & Pope, N. (2008). Educational leaders’ perceptions about ethical assessment practices. The American Journal of Evaluation, 29(4), 520–530.

  19. Joint Committee on Standards for Educational Evaluation. (2003). The student evaluation standards. Thousand Oaks, CA: Corwin Press.

  20. Liu, B., & Li, Y. (2010). Opportunities and barriers: Gendered reality in Chinese higher education. Frontiers of Education in China, 5(2), 197–221.

  21. Liu, J., Johnson, R., & Fan, X. (2016). A comparative study of Chinese and United States pre-service teachers’ perceptions about ethical issues in classroom assessment. Studies in Educational Evaluation, 48, 56–66.

  22. Liu, Y. S. (2007). Women entering the elite group: A progress with heavy cost. Beijing Normal University First Educational Sociology Forum Proceedings, 2, 508–531.

  23. Mehrens, W. A., & Kaminski, J. (1989). Methods for improving standardized test scores: Fruitful, fruitless, or fraudulent? Educational Measurement: Issues and Practice, 8(1), 14–22.

  24. Miles, M. B., Huberman, A. M., & Saldaña, J. (2014). Qualitative data analysis: A methods sourcebook. Thousand Oaks, CA: SAGE.

  25. National Council for Accreditation of Teacher Education. (2008). Professional standards for the accreditation of teacher preparation institutions. Retrieved from https://www.ncate.org/public/standards.asp.

  26. O’Leary, M. O. (2008). Towards an agenda for professional development in assessment. Journal of In-service Education, 34, 109–114.

  27. Oosterhof, A. (2009). Developing and using classroom assessments (4th ed.). Upper Saddle River, NJ: Merrill.

  28. Ory, J., & Ryan, K. (1993). Tips for improving testing and grading. Newbury Park, CA: Sage.

  29. Özbek, O. (2013). Physical education teachers’ types of analyzing professional ethical dilemmas. Life Science Journal, 10(1), 2670–2678.

  30. Pope, N., Green, S., Johnson, R., & Mitchell, M. (2009). Examining teacher ethical dilemmas in classroom assessment. Teaching and Teacher Education, 25, 778–782.

  31. Pope, N. S. (2006). Do no harm to whom? An examination of ethics and assessment. South Atlantic Philosophy of Education Society Yearbook (pp. 25–31).

  32. Popham, W. J. (1991). Appropriateness of teachers’ test preparation practices. Educational Measurement: Issues and Practice, 10(4), 12–15.

  33. Popham, W. J. (2000). Modern educational measurement: Practical guidelines for educational leaders. Needham, MA: Allyn & Bacon.

  34. Popham, W. J. (2017). Classroom assessment: What teachers need to know (8th ed.). Boston, MA: Pearson.

  35. Quesenberry, L. G., Phillips, J., Woodburne, P., & Yang, C. (2012). Ethics assessment in a general education programme. Assessment & Evaluation in Higher Education, 37(2), 193–213.

  36. Richardson, D., & Onwuegbuzie, A. J. (2004). Attitudes toward dispositions of teachers. Academic Exchange Quarterly, 8(3), 31–35.

  37. Ryan, A. (1997). Professional obligement: A dimension of how teachers evaluate their students. Journal of Curriculum and Supervision, 12(2), 118–134.

  38. Sax, G. (1974). Principles of educational measurement and evaluation. Belmont, CA: Wadsworth.

  39. Spillane, J. P. (1999). State and local government relations in the era of standards-based reform: Standards, state policy instruments, and local instructional policy making. Educational Policy, 13(4), 546–572.

  40. Stiggins, R. J., Frisbie, R. J., & Griswold, P. A. (1989). Inside high school grading practices: Building a research agenda. Educational Measurement: Issues and Practice, 8(2), 5–14.

  41. Strike, K., & Soltis, J. (1992). The ethics of teaching (2nd ed.). New York: Teachers College Press.

  42. Taylor, K., & Nolen, S. (2005). Classroom assessment: Supporting teaching and learning in real classrooms. Upper Saddle River, NJ: Pearson Education.

  43. Thorndike, R., Cunningham, G., Thorndike, R., & Hagen, E. (1991). Measurement and evaluation in psychology and education (5th ed.). New York: MacMillan.

  44. Tierney, R. D. (2013). Fairness as a multifaceted quality in classroom assessment. Studies in Educational Evaluation, 43, 55–69.

  45. Tirri, K. (1999). Teachers’ perceptions of moral dilemmas at school. Journal of Moral Education, 28(1), 31–47.

  46. Waley, A. (Trans.). (1989). The analects of Confucius. New York: Vintage Books.

  47. Warnick, B. R., & Silverman, S. K. (2011). A framework for professional ethics courses in teacher education. Journal of Teacher Education, 62(3), 273–285.

  48. Wiggins, G. (1994). None of the above. Executive Educator, 16(7), 14–18.

  49. Worthen, B. R., White, K. R., Fan, X., & Sudweeks, R. (1998). Measurement and assessment in the schools (2nd ed.). Boston, MA: Allyn & Bacon of Pearson.

Author information

Corresponding author

Correspondence to Xumei Fan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Scenarios and percent of respondents who agreed with experts

| Category | Scenario | % Agreement |
|---|---|---|
| Fairness/bias | 1. To enhance self-esteem, a professor addresses only students’ strengths when giving feedback to her students’ assignments since she believes that positive feedback is good for students’ growth | 48.0 |
| Fairness/bias | 10. A professor who knows a student had a bad week because of problems at home bumps the student’s participation grade up a few points to compensate for his bad score on a quiz | 63.3 |
| Communication about grading | 5. For the class-level final exam, a professor uses a few surprise items about additional topics that were covered in class but were not listed in the study guide | 9.4 |
| Communication about grading | 7. When assigning a team project to work on collaboratively, a professor does not provide rubrics on how it will be graded, stating instead that he will assign a score based on students’ overall performance on the project | 31.4 |
| Communication about grading | 11. At the beginning of the semester, a professor shares with students the rubrics for each task. The professor leads students in a discussion about the rubrics, makes changes to the rubrics according to students’ feedback, and gives students the final versions to guide their completion of the course tasks | 92.5 |
| Confidentiality | 2. A professor does not grade all class-level quizzes. Instead, he lets students grade each other’s paper and then share the results in groups | 19.9 |
| Confidentiality | 9. At the beginning of the class, when a student requests to see her grade of a final exam, her professor shows the student the whole score sheet that includes all students’ final scores | 73.2 |
| Grading practice | 4. As a professor finalizes grades, she notices the grade of a student is in between B+ and A. She gave the student an A− because tests and papers showed the student had mastered the course objectives even though he had not completed some of his homework assignments | 80.8 |
| Grading practice | 8. In grading a final exam, a professor always reads the student’s name and considers effort in assigning grades | 65.0 |
| Grading practice | 12. A professor is concerned that most students did not perform well on the class-level mid-term test. Based on the results, it has become mathematically impossible for about 70% of students to earn a passing grade. Thus, the professor adds 20 points to each student’s mid-term score to make sure most students still have a chance to pass at the end of the semester | 63.9 |
| Grading practice | 15. A college professor counts students’ attendance as 20% of their final grades | 17.8 |
| Multiple assessment opportunities | 3. A professor uses observational checklists, anecdotal notes, and interviews in assessing students | 87.2 |
| Multiple assessment opportunities | 14. An instructor uses only multiple-choice questions in the end-of-course exam. She justifies this practice by stating multiple-choice questions can be graded objectively and efficiently | 71.7 |
| Test administration | 6. While administering a final exam, a professor notices that a student has skipped a problem and is recording all of her answers out of sequence on the answer sheet. The professor shows the student where to record the answer she is working on, and instructs the student to put the answer to each question with the same number on the answer sheet | 34.0 |
| Test administration | 13. While administering a class-level mid-term test, a professor notices that most students missed the same question. The professor reminds all students to check their answers to that question one more time | 80.2 |
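As a consistency check on these percentages: if a respondent’s agreement score is the number of scenarios on which he or she agrees with experts (an assumption, since the scoring rule is not restated here), then the mean agreement score should be close to the sum of the fifteen per-scenario agreement proportions. The sketch below performs that arithmetic. It ignores any differences in the number of respondents per scenario, which the appendix does not report.

```python
# Percent agreement per scenario, copied from the appendix
# (scenario number: percent of respondents who agreed with experts).
PERCENT_AGREEMENT = {
    1: 48.0, 2: 19.9, 3: 87.2, 4: 80.8, 5: 9.4,
    6: 34.0, 7: 31.4, 8: 65.0, 9: 73.2, 10: 63.3,
    11: 92.5, 12: 63.9, 13: 80.2, 14: 71.7, 15: 17.8,
}

# Each scenario contributes its agreement proportion to the expected
# count of scenarios on which a respondent matches the experts.
expected_mean_score = sum(PERCENT_AGREEMENT.values()) / 100

print(round(expected_mean_score, 2))
```

The result, about 8.38 out of 15, falls within the range of group means reported in the quantitative results (roughly 7.9 to 8.6), which is consistent with the assumed scoring rule.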

Cite this article

Fan, X., Liu, X. & Johnson, R.L. A mixed method study of ethical issues in classroom assessment in Chinese higher education. Asia Pacific Educ. Rev. 21, 183–195 (2020). https://doi.org/10.1007/s12564-019-09623-y

Keywords

  • Classroom assessment
  • Higher education
  • Ethical issues
  • Mixed method