Over the past several years, student evaluations of instructors have taken on increasing importance as administrators have sought to use as many objective measures as possible to justify tenure/promotion decisions, to differentiate pay raises, and to provide feedback to the public. Student evaluations have been convenient tools because they produce numbers that appear to represent objective assessments of the effectiveness of the instructor. Student evaluation instruments are usually administered during a class session late in the semester and are relatively cheap and easy to administer. Also, such evaluations give the students the appearance of having input into decisions that relate to the quality of educational services offered.
Many instructors have expressed concern about how the data collected will be interpreted. Traditionally, the instructors have preferred that ratings be used for personal improvement rather than as administrative evaluations of the ability and effectiveness of the instructor. Issues of variation in ratings even in the same courses have been noted for some time. For example, Greenwald (1995) reported that his ratings for a course taught one semester placed him in the highest 10 percent of faculty ratings at his university. The next semester, he taught the same class by the same plan but student ratings placed him in the second lowest decile of the university faculty. How could such variation occur assuming similar efforts on the part of the faculty member?
Administrators often appear to subscribe to the view that "student ratings tend to be statistically reliable, valid, and relatively free from bias or the need for control; probably more so than any other data used for evaluation" (Cashin, 1995). Studies such as that by Marsh and Hocevar, which drew on data from 6,024 classes and 195 instructors, support the view that student evaluations of teaching performance are stable and consistent over time.
Given that student evaluations are being used extensively to make administrative decisions, it becomes increasingly important to understand variation and, in particular, to ascertain whether any systematic bias occurs when students evaluate instructors. For many years, female faculty members have addressed the issue of gender bias in higher education. Concerns about hiring practices, pay differences, and other issues have been widely addressed (see Kane, 1998; Colon, 1998); however, very little research appears to have been done to assess whether students have a systematic bias with regard to the gender of the course instructor. Daufin (1995) declared that "research says students judge white women and people of color more harshly than white male professors" but offered no evidence of such research. Chandler (1996) cited a 1975 study that reported that female students have higher regard for female instructors.
Bennet (1982) reported that his research showed that both male and female students placed greater demands on female instructors for student contact and support, but found no evidence of direct gender bias on teaching evaluations. Basow and Silburg (1987) reviewed evaluations from 553 male students and 527 female students using multivariate analysis of variance. Based on the data, they concluded that both male and female students rated female professors lower than male professors on instructor interaction with the individual student, with some differences between disciplines. Basow (1992) then completed a more in-depth study that included four semesters of data from a private liberal arts college. Her results show that there was no significant effect of student gender on the ratings of male instructors, but female students rated female instructors significantly higher than did male students.
The focus of this paper is to examine student evaluations of business instructors at a small regional university over a period of fourteen semesters. …