Should Student Evaluations of Teaching Be Nixed?

How should colleges and universities evaluate the teaching performance of their faculty members? While there are numerous possibilities, student evaluations of teaching are often the primary tool used to rate professors in the classroom. Although students’ assessments of their teachers can provide valuable feedback, a recent study by the American Political Science Association (APSA) confirmed previous assertions that student evaluations are often biased against women.

Although abundant evidence makes clear that student evaluations don’t correlate with student learning outcomes or even teaching effectiveness, gender (and other) bias in these evaluations might be an even more significant impediment to their usefulness. As the APSA study’s coauthor Kristina Mitchell asserts, student evaluations are often the principal basis for decisions about hiring, salary, promotions, and tenure. She argues further that using these evaluations is discriminatory. “The Equal Employment Opportunity Commission exists to enforce the laws that make it illegal to discriminate against a job applicant or employee based on sex. If the criteria for hiring and promoting faculty members is based on a metric that is inherently biased against women, is it not a form of discrimination?”

So how is gender bias detectable in student evaluations? The APSA study found that the language students use when evaluating male professors differs significantly from the language they use to evaluate female professors. For example, students tend to refer to female professors as teachers but to their male counterparts as professors, even when the professors’ educational and professional qualifications are equal. In another research project, based on word usage in reviews of teachers on the Web site Rate My Professors, researchers found that male faculty members are more likely to be described as “funny,” “brilliant,” “genius,” and “arrogant,” while female faculty members are more likely to be described as “approachable,” “helpful,” “nice,” and “bossy.” Moreover, the APSA study found that students comment on women’s appearance and personality far more often than on men’s; men are judged instead on their intelligence and competence.

Especially telling are the differences the APSA researchers found in evaluation scores for online courses that were administered identically and differed only in the gender of the instructor’s name. The data showed that across all categories the professors believed to be male received higher scores than the professors believed to be female, even on questions unrelated to the individual instructor’s ability, demeanor, or attitude. For example, in a comparison of two classes, a professor believed to be female received lower ratings than a professor believed to be male on questions about textbook relevancy and about the appropriateness of the technology used for delivering instruction, even though those variables were identical for both classes. Moreover, although requirements and assignments for both classes were the same, students’ ratings on those metrics also exposed gender bias. Of the twenty-three questions asked, there were none for which the instructor believed to be female received a higher rating than the instructor believed to be male.

Research on racial and ethnic bias in student evaluations has revealed equally problematic outcomes, confirming that such evaluations benefit certain faculty members, primarily white males, more than others. Critics argue that these assessment tools reward those already in positions of power and protect the status quo. As long as these assessments are used for decisions about professional advancement, they assert, women and minorities will continue to be underrepresented in tenure-track and high-level positions in the academy.

Mounting evidence of bias was enough for the University of Southern California to stop using student evaluations in tenure and promotion decisions. The provost, Michael Quick, put it simply: “I’m done. I can’t continue to allow a substantial portion of the faculty to be subject to this kind of bias.” Although the university will continue to use student evaluations for feedback to help faculty members adjust their teaching practices, the format of these assessments has been revised to eliminate bias-prone questions. And although more comprehensive evaluation tools are time-consuming and present other challenges, most faculty members and administrators agree the investment is worth it. As an article in Inside Higher Ed indicates, professors and pedagogical experts have been asking for more accurate and transparent evaluations for years: “A 2015 survey of 9,000 faculty members by the American Association of University Professors, for instance, found that 90 percent of respondents wanted their institutions to evaluate teaching with the same seriousness as research and scholarship.”

Other schools, like the University of Oregon, are following USC’s lead. As study after study confirms the inadequacy of student evaluations as professional assessment tools, more colleges and universities will likely look to these early adopters as models. As Michelle Falkoff concludes in The Chronicle of Higher Education, “[I]f academic institutions do not take steps to assess teaching more holistically, they run the risk of losing talented faculty members for reasons that are not only inappropriate but may well be illegal.”

Photo by University of Memphis