A Comparative Analysis of Different Models Explaining the Relationship between Instructor Ratings and Expected Student Grades

The widespread use of student evaluations to rate faculty has raised the question of whether high student evaluations can be gained simply by faculty giving higher grades to students, or whether student learning is a critical factor in such evaluations. Four models were tested, each representing a different relationship between students' expected grades and student evaluations of the quality of instructors, with and without student motivation, ability, and amount learned as potentially important variables. Evaluations from 119 students of four different instructors were used for the data set. Statistical tests of the alternative models indicated that a more complex model incorporating student motivation and ability levels as factors affecting student evaluations of instructors provided the best fit to the data. The fit was superior to that of a model using only expected grades and student evaluations of instructors, indicating that students' evaluations of faculty did not appear to be based solely on the grades students expected to receive. The complex model also fit the data better than a simpler model using only perceived amount learned, expected grades, and instructor ratings. For this data set, instructor ratings were not simply a function of expected grades, or simply a function of perceived amount learned, but a function of motivation, ability, amount learned, and grades.

In most colleges and universities in the U.S., students have long evaluated the performance of their instructors at the end of academic terms (e.g., Harrison, et al., 2004; Magner, 1997; Smith, 2004). Results of these evaluations are frequently utilized by individuals involved in personnel processes as a key criterion in making tenure and promotion decisions (Ehie & Karathanos, 1994; Harrison, et al., 2004; Smith, 2004; Williams & Ceci, 1997).

After extensive reviews of the literature, Marsh (1987) and Ellis, et al. (2004) found numerous studies that had reported positive relationships between the grades students expected to receive in classes and student ratings of instructors. Marsh (1987) and Ellis, et al. (2004) noted that, to the extent that this positive relationship may reflect grading leniency independent of other instructional attributes, such assessments might lack utility in measuring teaching effectiveness. However, Marsh and Ellis, et al. also noted that valid student evaluations could exhibit this same relationship if, in fact, more effective teaching resulted in both higher expected grades and higher instructor ratings.

A number of other researchers have also examined this issue. For example, following an extensive examination of written comments on student evaluations, Trout (1997) specifically noted that level of course rigor appeared to be negatively associated with student ratings of instructors. Greenwald and Gillmore (1997) also concluded that ratings of instructors were affected by grading leniency, and described a statistical method that could be used to remove such contamination. Similarly, Ellis, et al. (2004) found evidence that the average student grade given in a course was a significant predictor of average student ratings of instructional quality of that course, and also suggested a need for adjusting student evaluations based on grades for a class. Krautman and Sander (1999) also found that high grades were related to higher teaching evaluations, and noted that this evidence indicated that such evaluations were a flawed measure of teaching performance. Similarly, McKeachie (1997) noted that care should be taken in how student ratings of instructors are utilized for personnel decisions, because, in some cases, higher grades may be given by instructors in an attempt to produce more positive teaching evaluations. Crumbley, et al. (2004) found that student evaluations might have encouraged a lack of rigor in the classroom on the part of instructors in order to gain higher evaluations. …