What’s in a Name: Exposing Gender Bias in Student Ratings of Teaching

Abstract

Student ratings of teaching play a significant role in career outcomes for higher education instructors. Although instructor gender has been shown to play an important role in influencing student ratings, the extent and nature of that role remain contested. While it is difficult to separate gender from teaching practices in person, it is possible to disguise an instructor’s gender identity online. In our experiment, assistant instructors in an online class each operated under two different gender identities. Students rated the male identity significantly higher than the female identity, regardless of the instructor’s actual gender, demonstrating gender bias. Given the vital role that student ratings play in academic career trajectories, this finding warrants considerable attention.

Figure 1

Notes

  1. To clarify the language we use throughout the paper, we refer to all three persons responsible for grading and directly interacting with students as “instructors.” The course “professor” was the person responsible for course design and content preparation, while the two “assistant instructors” worked under the professor’s direction to manage and teach their respective discussion groups.

  2. A one-way ANOVA test confirmed that there was no significant variation among all six groups’ discussion board grades and overall grades for the course.

  3. We acknowledge that the application of parametric analytical techniques (ANOVA, MANOVA, and t-tests) to ordinal data (the Likert scale responses) remains controversial among social scientists and statisticians. (See Knapp [1990] for a relatively balanced review of the debate.) We side with the arguments of Gaito (1980) and Armstrong (1981) and argue that it is appropriate to do so in our case as the concept being measured is interval, even if the data labels are not. This practice is common within higher education research (e.g., Centra & Gaubatz [2000]; Young, Rush, & Shaw [2009]; Basow [1995]; and Knol et al. [2013]).

  4. While we acknowledge that a significance level of .05 is conventional in social science and higher education research, we side with Skipper, Guenther, and Nass (1967), Labovitz (1968), and Lai (1973) in pointing out the arbitrary nature of conventional significance levels. Considering our study design, we have used a significance level of .10 for some tests where: 1) the results support the hypothesis and we are consequently more willing to reject the null hypothesis of no difference; 2) our hypothesis is strongly supported theoretically and by empirical results in other studies that use lower significance levels; 3) our small n may be obscuring large differences; and 4) the gravity of an increased risk of Type I error is diminished in light of the benefit of decreasing the risk of a Type II error (Labovitz, 1968; Lai, 1973).
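
As an illustration of the kind of check described in Note 2, the following is a minimal sketch of how a one-way ANOVA F statistic is computed. It is not the analysis code used in this study: the function name `one_way_anova_f` and the sample values are hypothetical, chosen only so the arithmetic is easy to follow, and the sketch stops at the F statistic and its degrees of freedom rather than computing a p-value.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    Hypothetical helper for illustration; each element of `groups`
    is one group's list of scores.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")

    group_means = [fmean(g) for g in groups]
    grand_mean = fmean([x for g in groups for x in g])

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up scores for three groups (not data from the study).
example = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(example))  # F = 3.0 with (2, 6) degrees of freedom
```

An F statistic near zero indicates that the group means barely differ relative to the spread within groups, which is the pattern a finding of "no significant variation" reflects. Whether a given F is significant still depends on the chosen significance level, the issue taken up in Note 4.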

References

  • Abrami, P. C., d’Apollonia, S., & Rosenfield, S. (2007). The dimensionality of student ratings of instruction: What we know and what we do not. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 385–445). Dordrecht, The Netherlands: Springer.

  • Acker, J. (1990). Hierarchies, jobs, bodies: A theory of gendered organizations. Gender & Society, 4, 139–158.

  • Andersen, K., & Miller, E. D. (1997). Gender and student evaluations of teaching. PS: Political Science and Politics, 30, 216–219.

  • Armstrong, G. D. (1981). Parametric statistics and ordinal data: A pervasive misconception. Nursing Research, 30, 60–62.

  • Bachen, C. M., McLoughlin, M. M., & Garcia, S. S. (1999). Assessing the role of gender in college students’ evaluations of faculty. Communication Education, 48, 193–210.

  • Basow, S. A. (1995). Student evaluations of college professors: When gender matters. Journal of Educational Psychology, 87, 656–665.

  • Basow, S. A., & Montgomery, S. (2005). Student ratings and professor self-rating of college teaching: Effects of gender and divisional affiliation. Journal of Personnel Evaluation in Education, 18, 91–106.

  • Basow, S. A., Phelan, J. E., & Capotosto, L. (2006). Gender patterns in college students’ choices of their best and worst professors. Psychology of Women Quarterly, 30, 25–35.

  • Basow, S. A., & Silberg, N. T. (1987). Student evaluations of college professors: Are female and male professors rated differently? Journal of Educational Psychology, 79, 308–314.

  • Bennett, S. K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. Journal of Educational Psychology, 74, 170–179.

  • Benton, S. L., & Cashin, W. E. (2014). Student ratings of instruction in college and university courses. In M. B. Paulsen (Ed.), Higher education: Handbook of theory and research (pp. 279–326). Dordrecht, The Netherlands: Springer.

  • Burns-Glover, A. L., & Veith, D. J. (1995). Revisiting gender and teaching evaluations: Sex still makes a difference. Journal of Social Behavior and Personality, 10, 69–80.

  • Centra, J. A. (2007). Differences in responses to the student instructional report: Is it bias? Princeton, NJ: Educational Testing Service.

  • Centra, J. A., & Gaubatz, N. B. (2000). Is there gender bias in student evaluations of teaching? Journal of Higher Education, 71, 17–33.

  • Chamberlin, M. S., & Hickey, J. S. (2001). Student evaluations of faculty performance: The role of gender expectations in differential evaluations. Educational Research Quarterly, 25, 3–14.

  • Curtis, J. W. (2011). Persistent inequity: Gender and academic employment. Report from the American Association of University Professors. Retrieved from http://www.aaup.org/NR/rdonlyres/08E023AB-E6D8-4DBD-99A0-24E5EB73A760/0/persistent_inequity.pdf

  • Dalmia, S., Giedeman, D. C., Klein, H. A., & Levenburg, N. M. (2005). Women in academia: An analysis of their expectations, performance and pay. Forum on Public Policy, 1, 160–177.

  • Davis, B. G. (2009). Tools for teaching (2nd ed.). San Francisco, CA: Jossey-Bass.

  • Feldman, K. A. (1992). College students’ views of male and female college teachers: Evidence from the social laboratory and experiments – Part 1. Research in Higher Education, 33, 317–375.

  • Feldman, K. A. (1993). College students’ views of male and female college teachers: Evidence from the social laboratory and experiments – Part 2. Research in Higher Education, 34, 151–211.

  • Gaito, J. (1980). Measurement scales and statistics: Resurgence of an old misconception. Psychological Bulletin, 87, 564–567.

  • Garson, G. D. (2012). General linear models: Multivariate GLM & MANOVA/MANCOVA. Asheboro, NC: Statistical Associates.

  • Goldberg, P. (1968). Are women prejudiced against women? Trans-action, 5, 28–30.

  • Greenwald, A. G. (1997). Validity concerns and usefulness of student ratings of instruction. American Psychologist, 52, 1182–1186.

  • Hair, J. F., Jr., Anderson, R. E., Tatham, R. L., & Black, W. C. (1998). Multivariate data analysis with readings (5th ed.). Englewood Cliffs, NJ: Prentice-Hall.

  • Hampton, S. E., & Reiser, R. A. (2004). Effects of a theory-based feedback and consultation process on instruction and learning in college classrooms. Research in Higher Education, 45, 497–527.

  • Johnson, V. E. (2003). Grade inflation: A crisis in college education. New York, NY: Springer.

  • Johnson, A. (2006). Power, privilege, and difference. Boston, MA: McGraw-Hill.

  • Knapp, T. R. (1990). Treating ordinal scales as interval scales: An attempt to resolve the controversy. Nursing Research, 39, 121–123.

  • Knol, M. H., Veld, R., Vorst, H. C. M., van Driel, J. H., & Mellenbergh, G. J. (2013). Experimental effects of student evaluations coupled with collaborative consultation on college professors’ instructional skills. Research in Higher Education, 54, 825–850.

  • Labovitz, S. (1968). Criteria for selecting a significance level: A note on the sacredness of .05. The American Sociologist, 3, 220–222.

  • Lai, M. K. (1973). The case against tests of statistical significance. Report from the Teacher Education Division Publication Series. Retrieved from http://files.eric.ed.gov/fulltext/ED093926.pdf

  • Liu, O. L. (2012). Student evaluation of instruction: In the new paradigm of distance education. Research in Higher Education, 53, 471–486.

  • Lorber, J. (1994). Paradoxes of gender. New Haven, CT: Yale University Press.

  • Marsh, H. W. (2001). Distinguishing between good (useful) and bad workloads on students’ evaluations of teaching. American Educational Research Journal, 38, 183–212.

  • Marsh, H. W. (2007). Students’ evaluations of university teaching: Dimensionality, reliability, validity, potential biases and usefulness. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 319–383). Dordrecht, The Netherlands: Springer.

  • Miller, J., & Chamberlin, M. (2000). Women are teachers, men are professors: A study of student perceptions. Teaching Sociology, 28, 283–298.

  • Monroe, K., Ozyurt, S., Wrigley, T., & Alexander, A. (2008). Gender equality in academia: Bad news from the trenches, and some possible solutions. Perspectives on Politics, 6, 215–233.

  • Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inference: Methods and principles for social research. Cambridge, MA: Cambridge University Press.

  • Morris, L. V. (2011). Women in higher education: Access, success, and the future. Innovative Higher Education, 36, 145–147.

  • Morrison, K., & Johnson, T. (2013). Editorial. Educational Research and Evaluation, 19, 579–584.

  • Murray, H. G. (2007). Low-inference teaching behaviors and college teaching effectiveness: Recent developments and controversies. In R. P. Perry & J. C. Smart (Eds.), The scholarship of teaching and learning in higher education: An evidence-based perspective (pp. 145–183). Dordrecht, The Netherlands: Springer.

  • O’Sullivan, P. D., Hunt, S. K., & Lippert, L. R. (2004). Mediated immediacy: A language of affiliation in a technological age. Journal of Language and Social Psychology, 23, 464–490.

  • Paludi, M. A., & Strayer, L. A. (1985). What’s in an author’s name? Differential evaluations of performance as a function of author’s name. Sex Roles, 12, 353–361.

  • Perry, R. P., & Smart, J. C. (Eds.). (2007). The scholarship of teaching and learning in higher education: An evidence-based perspective. Dordrecht, The Netherlands: Springer.

  • Risman, B. J. (2004). Gender as a social structure: Theory wrestling with activism. Gender & Society, 18, 429–450.

  • Rowden, G. V., & Carlson, R. E. (1996). Gender issues and students’ perceptions of instructors’ immediacy and evaluation of teaching and course. Psychological Reports, 78, 835–839.

  • Sandler, B. R. (1991). Women faculty at work in the classroom, or, why it still hurts to be a woman in labor. Communication Education, 40, 6–15.

  • Sidanius, J., & Crane, M. (1989). Job evaluation and gender: The case of university faculty. Journal of Applied Social Psychology, 19, 174–197.

  • Simeone, A. (1987). Academic women: Working toward equality. South Hadley, MA: Bergin and Garvey.

  • Skipper, J. K., Guenther, A. C., & Nass, G. (1967). The sacredness of .05: A note concerning the uses of statistical levels of significance in social science. The American Sociologist, 1, 16–18.

  • Sprague, J., & Massoni, K. (2005). Student evaluations and gendered expectations: What we can’t count can hurt us. Sex Roles, 53, 779–793.

  • Statham, A., Richardson, L., & Cook, J. A. (1991). Gender and university teaching: A negotiated difference. Albany, NY: State University of New York Press.

  • Subramanya, S. R. (2014). Toward a more effective and useful end-of-course evaluation scheme. Journal of Research in Innovative Teaching, 7, 143–157.

  • Svanum, S., & Aigner, C. (2011). The influences of course effort, mastery and performance goals, grade expectancies, and earned course grades on student ratings of course satisfaction. British Journal of Educational Psychology, 81, 667–679.

  • Svinicki, M., & McKeachie, W. J. (2010). McKeachie’s teaching tips: Strategies, research, and theory for college and university teachers (13th ed.). Belmont, CA: Wadsworth.

  • Theall, M., Abrami, P. C., & Mets, L. A. (Eds.). (2001). The student ratings debate: Are they valid? How can we best use them? San Francisco, CA: Jossey-Bass.

  • West, C., & Zimmerman, D. H. (1987). Doing gender. Gender & Society, 1, 125–151.

  • Young, S., Rush, L., & Shaw, D. (2009). Evaluating gender bias in ratings of university instructors’ teaching effectiveness. International Journal of Scholarship of Teaching and Learning, 3, 1–14.

Author information

Corresponding author

Correspondence to Lillian MacNell.

About this article

Cite this article

MacNell, L., Driscoll, A. & Hunt, A.N. What’s in a Name: Exposing Gender Bias in Student Ratings of Teaching. Innov High Educ 40, 291–303 (2015). https://doi.org/10.1007/s10755-014-9313-4

  • DOI: https://doi.org/10.1007/s10755-014-9313-4

Keywords

  • gender inequality
  • gender bias
  • student ratings of teaching
  • student evaluations of instruction