Magazine article Academe

Student Teaching Evaluations: Inaccurate, Demeaning, Misused

Article excerpt

Administrators love student teaching evaluations. Faculty need to understand the dangers of relying on these flawed instruments.

Fifty years ago, students at Harvard University and the University of California, Berkeley, were publishing guides rating teachers and courses. Irreverent and funny, they featured pungent comments: "Trying to understand Professor X's lectures is like slogging uphill through molasses," or "Dr. Y communicated very closely with the blackboard, but I couldn't tell you what he looks like, as he never faced the class." Unfortunately, what originated as a light-hearted dope sheet for the use of students has, at the hands of university and college administrators, turned into an instrument of unwarranted and unjust termination for large numbers of junior faculty and a source of humiliation for many of their senior colleagues.

In the 1970s, schools started requiring faculty to get students to fill out and turn in teaching evaluation forms to the administration. Administrators soon discovered they had a weapon to use against 50 percent of the faculty: they could proclaim that the half of the faculty with below-average scores in each and every department were bad teachers. They have been at it ever since. When administrators say, as they often do, "We won't tenure Professor X or give Professor Y a salary raise because he or she has teaching evaluations that are below average," they are saying, in effect, that "below average" means bad.

We know of one administration that heroically enlarged the proportion of no-good faculty members to 90 percent by declaring that any junior faculty member who failed to achieve scores in the top 10 percent could not be promoted. But most administrations are content to bad-mouth a mere 50 percent. (If the "average" administrators use is the median, then exactly half of the faculty will be labeled bad. If they use the mean, the proportion labeled bad will probably be slightly above or below half.)
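
The difference between the median and the mean can be checked with a few lines of Python. This is only a sketch: the scores, the 1-to-5 scale, and the helper name share_below are invented for illustration and do not come from any real department.

```python
from statistics import mean, median

def share_below(scores, cutoff):
    """Fraction of scores strictly below the cutoff."""
    return sum(s < cutoff for s in scores) / len(scores)

# Hypothetical evaluation scores, invented for illustration only.
# One very low score acts as an outlier.
scores = [2.0, 3.8, 4.0, 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7]

below_median = share_below(scores, median(scores))
below_mean = share_below(scores, mean(scores))

print(f"share below median: {below_median:.0%}")
print(f"share below mean:   {below_mean:.0%}")
```

With these invented numbers, half of the scores fall below the median, while the single low outlier pulls the mean down so that fewer than half fall below it. A different set of scores could push the share below the mean above one half instead.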

These administrators treat relative position as if it were an absolute measure of merit. They do not allow for the possibility that some departments will have mostly good teachers, in which case some or even all of those with below-average evaluations will be good teachers. They also do not envision departments in which most of the teachers are poor, in which case some or all of those with above-average evaluations may be poor teachers. It is simply incorrect to assume that each department is half and half, or that a whole university is half and half. A faculty member who gets ratings that are well below average is unlikely to be a shining star of teaching, but he or she may be quite good, valuable to the department and the students, and worthy of tenure and a decent salary.

Administrators who would like to achieve a faculty in which everyone is above average should move to Lake Wobegon, the only place where such a thing is possible. Everywhere else, if all those who were below average were fired, the average would simply rise, and about half the previously "good" teachers would then be below the new average, miraculously reborn as "bad" teachers.
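
The arithmetic behind that claim is easy to try out. The sketch below assumes "average" means the arithmetic mean and uses ten invented scores on a 1-to-5 scale; the data and the function name cull_below_average are hypothetical, chosen only to illustrate the effect.

```python
from statistics import mean

def cull_below_average(scores):
    """Keep only the scores at or above the mean of the given list."""
    avg = mean(scores)
    return [s for s in scores if s >= avg]

# Hypothetical evaluation scores, invented for illustration only.
scores = [2.8, 3.1, 3.4, 3.6, 3.9, 4.0, 4.2, 4.3, 4.5, 4.7]

old_avg = mean(scores)
survivors = cull_below_average(scores)   # everyone below average is "fired"
new_avg = mean(survivors)                # the average of those who remain

# Teachers who were "good" a moment ago but now sit below the new average.
newly_below = [s for s in survivors if s < new_avg]

print(f"old average: {old_avg:.2f}, new average: {new_avg:.2f}")
print(f"{len(newly_below)} of {len(survivors)} survivors are now below average")
```

In this made-up case, six teachers survive the first cut, the average rises, and three of the six find themselves below it. Repeating the cut would keep shrinking the faculty without ever producing a department in which everyone is above average.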

One might argue that administrations should give up using relative order, and instead fix on some particular student evaluation score as the borderline between adequate and inadequate teaching. That would make sense if the ratings actually measured teaching effectiveness, but there is evidence that they do not.

Stephen J. Ceci, a professor at Cornell University, devised an experiment to see what might affect student evaluations. He taught a developmental psychology course twice, the first time using his customary style. The second time, he covered the same material and used the same textbook, but made a big effort to be more exuberant, adding hand gestures and varying the pitch of his voice. He characterized the results as "astounding": his ratings for the second class soared. The students even gave higher ratings to the textbook. …
