Liberal Education

Reevaluating Teaching Evaluations

College professors are not the only highly educated professionals who debate the value of evaluations from those in the demographic they serve. A nontrivial number of doctors, lawyers, and journalists also take issue with evaluations from, respectively, their patients, law clients, and readers. Indeed, the argument against such ratings is a strong one. That is, these individuals are not, in nearly all cases, qualified to assess teaching, medicine, the law, or journalism. Without knowledge of what constitutes best practice in these fields, unqualified evaluators may not actually be judging skill but rather the “hotness” of a professor, the severity of a doctor’s diagnosis, or even the attire of a lawyer.

A backlash against such reviews has been simmering for some time. The Atlantic has compared most online comment sections to sewers, stating that they are “logistically required, but consistently disgusting, subterranean conduits for what is, technically speaking, waste.”1 Popular Science abruptly ended online comments from readers, observing that comments not only polarized readers but changed readers’ understandings of the stories. Explaining the Huffington Post’s recent decision to no longer permit anonymous comments, founder and editor-in-chief Arianna Huffington noted that “freedom of expression is given to people who stand up for what they’re saying and who are not hiding behind anonymity.”2 Ardis Dee Hoven, president of the American Medical Association, has stated that anonymous online comments about physicians should be “taken with a grain of salt.”3

Some of the disgruntled in other professional sectors have attempted to game the system. The New York Times ran a story about a company that offered its customers a ten-dollar rebate for posting online reviews of its products on Amazon.4 (Spoiler alert: nearly all of this company’s reviews were great!) Come to think of it, the actions of this company are not all that different from those of college professors who offer doughnuts or extra credit to students who complete teaching evaluation surveys, presumably as a quid pro quo for favorable evaluations.

What is the modern college or university to do? Do we surrender to the Yelp-ification of a society that suggests a Reddit-like “wisdom of the crowds” mentality should prevail? Or, in a nod to academe’s rebellious roots, do we (in a term coined by the Atlantic) “pull a PopSci” and accept no student comments?

Feedback, not evaluations

The answer is not to ban student comments. Even Popular Science and the Huffington Post continue to accept feedback in the form of letters to the editor. Colleges and universities might take a cue from journalists by encouraging formal written comments from students concerning classroom experiences. Unlike doctors, lawyers, and journalists, professors are actually in the business of teaching students to think critically and write effectively. It would be hypocritical for faculty to discourage students from thinking and writing about any topic, including our ability to reach them as students.

However, given that language shapes how we think, we propose new terminology. Surveys provide students an opportunity for feedback about teachers, not evaluations of teachers. Students, professors, and administrators should not view the surveys as an opportunity to judge a professor, but as an opportunity to provide grist for the faculty member’s mill. Like some writers of letters to the editor, some students may rant, some may be insightful, some may be boring, and some may move us to think in ways that we had never thought before. Our students’ critical thinking and writing abilities will be uneven. Nevertheless, we must take to heart our duty to encourage students to develop—not suppress—these skills.

Of course, any policy that solicits student feedback should have integrity. In order to guard against survey-induced grade inflation, anonymity should be granted to student commenters as long as grades are pending. Once grades are released, however, student commenters should be identifiable to instructors in order to provide a context for the feedback. Negative feedback from a conscientious student should be interpreted differently from negative feedback from a disengaged student. Positive feedback from a student who rarely showed up to class must be understood differently from positive feedback from an engaged student. Academe has a strong precedent for transparent feedback. That is, identifiable professors assign grades to students. With this context, students are able to interpret, for example, a grade of A. An A from a challenging professor means something entirely different from an A from an easy professor. Faculty should be afforded the same context that is provided for students.

Researchers have written thousands of peer-reviewed articles on student-authored teaching evaluations (we found articles dating as far back as 1929), but virtually all relate to evaluations of college professors rather than educators below the college level. Why does student opinion suddenly matter more once a student is in college? We imagine the answer is that colleges feel obligated to justify cost. This may also explain why student evaluations persist in the absence of research attesting to their validity or reliability.

Survey design and implementation

Nevertheless, college and university administrators who decide that student feedback surveys must generate quantitative data—a decision that should not be a foregone conclusion—should consult faculty with knowledge of best practices in data collection and analysis, such as statisticians, mathematicians, and psychologists. In particular, they should seek to ensure that statistical analysis includes not just mean scores but margins of error. A recent study argues persuasively that student evaluations of teachers are misunderstood and improperly used by both administrators and faculty precisely because margins of error are not considered.5 There is a big difference between a rating of 3.0 on a 4-point scale when the margin of error is 0.2 and when the margin is 2.0. Further, in writing the questions that will ultimately generate the quantitative data, administrators should consult with faculty who are knowledgeable about best practices in education, such as education faculty.
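The point about margins of error can be made concrete with a small sketch. The Python below is illustrative only: the ratings are hypothetical, the function name mean_with_margin is our own, and the margin is computed with a normal approximation (z times the standard error), which is an assumption rather than a method prescribed by the study cited above. For very small classes, a t-based margin would be wider still.

```python
import math
import statistics


def mean_with_margin(ratings, confidence=0.95):
    """Return (mean, margin of error) for a list of numeric ratings.

    Assumption: uses the normal approximation z * s / sqrt(n). For very
    small classes a t-based critical value would give a wider margin.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate a margin")
    mean = statistics.mean(ratings)
    # Standard error of the mean from the sample standard deviation.
    std_err = statistics.stdev(ratings) / math.sqrt(n)
    # Two-sided critical value, about 1.96 at 95 percent confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * std_err


# Hypothetical data: two sections that both average 3.0 on a 4-point scale.
large_section = [3] * 20 + [4] * 5 + [2] * 5
small_section = [1, 4, 4, 3]

for name, ratings in [("large", large_section), ("small", small_section)]:
    mean, margin = mean_with_margin(ratings)
    print(f"{name}: mean {mean:.2f}, margin of error +/- {margin:.2f}")
```

Both hypothetical sections report the same mean, but the four-student section carries a margin several times wider than the thirty-student section, so the two identical averages support very different conclusions.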

To avoid inaccuracies associated with incomplete, nonrandom samples, administrators should solicit feedback from the entire student body. Otherwise, historically marginalized populations on campus, including low-income, minority, and first-generation college students, may not understand that their feedback is desired. To achieve complete data sets, colleges and universities should withhold grades for students who do not complete surveys, just as they do when students do not meet financial obligations. Of course, students should be allowed to abstain by, for example, checking a box on the survey marked, “I choose to abstain.” Students who formally abstain should not be subject to withheld grades.

The administration—not the faculty—should take responsibility for administering and collecting student feedback surveys. A faculty member’s involvement in administering the surveys presents an inherent conflict of interest—think of the doughnuts and abundant extra-credit points—that could intentionally or unintentionally bias the feedback.

Colleges and universities should never rely on student feedback surveys as their only form of assessment for either full-time or adjunct faculty. When it comes to faculty assessment, colleges and universities must rely on professionals—deans or department chairs with training in assessment—to conduct class observations. Faculty should also be invited to submit external data—such as student grades on a common final exam, general education assessment scores, or scores on a national exam such as the Praxis—concerning student achievement that are independent of the student’s opinion and dependent on how much the student learned. A faculty member with low ratings on student feedback surveys but excellent results on the students’ common final exams should be valued differently from a faculty member with high ratings on student feedback surveys but poor results on the students’ common final exams. Without incorporating external data into faculty assessment, student feedback surveys will likely induce grade inflation.

BuzzFeed recently analyzed the word choice, sentence length, and other features of comments from various news websites.6 CNN commenters are apparently writing for readers at the seventh-grade level, while New York Times commenters write for readers at the tenth-grade level. Higher education’s population of undergraduates is, by definition, homogeneous in educational level. Would our students actually clock in as the undergraduate writers and readers they are? If so, we could count on reading well-supported and thoughtful written comments. If not, we might reflect on whether their inability to write at an undergraduate level is, on its own, a negative statement about our teaching effectiveness.

Ultimately, it is up to the administration to ensure the integrity of data collection methods. And it is up to the faculty to foster students’ abilities to think critically and write effectively every day, including the days they are asked to complete student feedback surveys.

To respond to this article, e-mail, with the authors’ names on the subject line.


1. Derek Thompson, “The Case for Banning Internet Commenters (Just Not Everywhere),” The Atlantic, September 24, 2013.

2. Elizabeth Landers, “Huffington Post to Ban Anonymous Comments,” CNN, August 22, 2013.

3. Jessica Merritt, “Online Reputation Management for Doctors a Growing Concern,” Online Reputation Management, February 20, 2014.

4. David Streitfeld, “For $2 a Star, an Online Retailer Gets 5-star Product Reviews,” New York Times, January 26, 2012.

5. G. A. Boysen, T. J. Kelly, H. N. Raesly, and R. W. Casner, “The (Mis)interpretation of Teaching Evaluations by College Faculty and Administrators,” Assessment & Evaluation in Higher Education 39, no. 6 (2014): 641–56.

6. Anna North, “Where to Find the Smartest Commenters on the Internet,” BuzzFeed, May 9, 2013.

Susan D’Agostino is associate professor of mathematics, and Jay Kosegarten is assistant professor of psychology, both at Southern New Hampshire University.
