Although considerable research has been conducted in the arena of teacher effectiveness, important questions continue to persist: What is effective teaching? How may it be defined? How may it be measured? To date, educators and researchers have failed to reach agreement about clear-cut answers to these questions; indeed, consensus may not be possible. The answers undoubtedly are affected by a number of factors, including the recognition that teaching effectiveness comprises multiple perspectives (Abrami, d'Apollonia, & Rosenfield, 1997; Marsh & Dunkin, 1997) as well as such characteristics as the type of course, class size, student abilities, and grading practices (Abrami, d'Apollonia, & Cohen, 1990; Greenwald & Gillmore, 1997). Additionally, methodology may have a differential effect on findings. For example, according to d'Apollonia and Abrami (1997), reviews of multisection validity studies often reach different conclusions due, in part, to reviewers' biases.
The primary issues addressed in this study concern whether teaching effectiveness might be defined in multiple ways and, if so, how. For example, does the multidimensionality of the construct dictate multiple definitions, or might a single definition be able to capture it? We addressed these matters from the standpoint of student-rated teacher effectiveness and with methodology that, though somewhat unconventional, was selected in an effort to maximize the interpretability of results by magnifying relationships while minimizing measurement error. Our approach was to ask students to rate instructors from whom they had taken a college or university course in the recent past, using high-inference items garnered from the teaching effectiveness literature that have been shown to be strong correlates of teaching effectiveness. Before providing their ratings, the students received a brief training and question-and-answer session that addressed the avoidance of rating errors and clarified the items, procedures, the referent, and the like. The data they provided were then analyzed to determine whether multiple definitions of teacher effectiveness were indicated.
A large segment of the teacher effectiveness literature describes investigations that seek to identify the characteristics, factors, traits, classroom behaviors, and so on, of effective teachers through ratings of instruction; student ratings, in particular, have received a great deal of attention. The validity of student ratings has been thoroughly analyzed and generally supported in the literature during the past 25 years (Centra, 1994; Cohen, 1981, 1987; Feldman, 1989; Marsh, 1987; Marsh & Bailey, 1993). Indeed, Greenwald (1997) suggests that reviews of research conducted since about 1980 indicate overwhelming evidence supporting the construct, convergent, discriminant, and consequential validity of student ratings.
Student ratings of instruction have been found to correlate highly with instructor personality traits (Feldman, 1986; Murray, Rushton, & Paunonen, 1990; Renaud & Murray, 1996). The Dr. Fox experiments of the 1970s (Marsh, 1987; Naftulin, Ware, & Donnelly, 1973) illustrated that students rated charismatic and expressive instructors as highly effective, regardless of the substantive content of a lecture. Murray et al. (1990) correlated peer ratings of personality traits with student ratings and found that the correlations between traits and ratings differed among course types. Renaud and Murray (1996) also investigated relationships between instructor personality and student ratings but did not differentiate between course types. Their patterns of correlations were stronger than those found in the Murray et al. study, a difference that could be due to restricted ranges in the earlier study.
Student achievement and student ratings have been found to be related (Cohen, 1981, 1987; Greenwald & Gillmore, 1997; Marsh, 1987). …