1. Introduction
Despite disagreement over their positive and negative effects on teaching and learning, Student Evaluations of Teaching and Teachers (SET) have been widely used for years, particularly in tertiary institutions (Jones, 1989; Ory & Ryan, 2001). Student feedback has thus become common and is regarded as valuable by many institutions. Studies have mainly focused on the factors that might affect the validity and reliability of student evaluations, defining three major categories: instructor-level, subject-level and student-level (Pozo-Munoz, Rebolloso-Pacheco & Fernandez-Ramirez, 2000).
Instructor-level factors are summarized as instructors' use of class time, availability outside class time, how well they assess student learning or understanding, concern for students' welfare and performance, the extent to which they emphasize analytical or critical skills, preparedness, and tolerance of alternative viewpoints in class (Pozo-Munoz et al., 2000). According to Boex (2000), presentation and organizational skills, clarity of expression, how the instructor uses grading and assignments, intellectual capabilities, the ability to interact well with students, and the ability to motivate students are also important instructor-level factors. A number of studies also highlight the significance of instructor personality for SET, including expertise, ability to motivate, management of student behavior, level of excitement, interpersonal skills, showing a caring nature, being systematic, and showing respect for students (Brown & Atkins, 1993; Lowman & Mathie, 1993; Patrick & Smart, 1998). In addition, the instructor's reputation was found to influence ratings, unlike title and position (Boex, 2000; Jacobs, 2002; Murray, Rushton & Paunonen, 1990; Shevlin et al., 2000).
Subject-level factors in student evaluations include the time of the lesson, whether the subject is an elective or a prerequisite, its level, its perceived difficulty, and the size of the class (Neumann, 2000). Although the time of day or year does not seem to have a great effect on evaluation, rating surveys given immediately after a final exam have been found to be less valid because of anxiety and fatigue. Regarding class size, it has been found that students in larger classes give lower ratings. In their extensive study, Davies et al. (2006) found that class size did not have a significant effect on student ratings in first- and second-year subjects, but had a negative effect in later-year classes. Whether the subject itself has an effect on evaluation has also been a focus of studies. It has been found that Humanities and Art-type subjects receive higher ratings than Mathematics-type courses, as students tend to feel incompetent in quantitative skills (Braskamp & Ory, 1994; Cashin, 1990; Neumann, 2000).
Student-level factors include student biases, reasons for taking the course, the effort students expend in the subject, age, ethnicity, gender, and students' grade expectations. Studies on gender have yielded inconclusive results, as some support potential biases, while others indicate that gender has no effect on evaluation (Basow, 2000; Cashin, 1990; Feldman, 1993; Wolfer & Johnson, 2003). However, Mason, Steagall and Fabritus (1995) claim that females are more likely to give positive ratings of teacher effectiveness. Regarding age, no significant effect on evaluation has been found; however, Worthington (2002) claims that students over 30 are more likely to give lower ratings. Among all factors, student expectations seem to be at the center of several studies. Focusing on grade expectations, Marsh (1987) explains the correlation between grades and student evaluations through three hypotheses: (1) the leniency hypothesis, which relates to the teacher's leniency in grading; (2) the validity hypothesis, which relates to the amount of knowledge students have gained and the favor they show by giving high ratings; and (3) the prior characteristic hypothesis, which relates to particular student or course factors such as motivation or class size. Other factors arising from students, such as cultural background, thinking style, learning style, high grade expectation, and nationality, have also been found to be positively correlated with instructor ratings, although they are not directly related to the instructor (Boex, 2000; Germain & Scandura, 2005; Zhang, 2004; Worthington, 2002). Underlining the importance of nationality, Worthington (2002) claims that students from a non-English speaking background expect higher grades and tend to give higher ratings (Millea & Grimes, 2002).
Personality, mentioned above as an instructor-level factor, is also regarded as a student-level factor, since students tend to give higher ratings to teachers who are kind, funny, enthusiastic and entertaining (Feldman, 1993; Wilson, 1998).
In addition to the instructor-, student- and subject-level factors mentioned above, the method of administration applied in SET is also important for reliability purposes. A great many studies show that anonymity has an effect on students' evaluations of teachers, and that non-anonymous methods result in higher ratings than anonymous ones (Fries & McNinch, 2003). Davies et al. (2010) indicate that the purpose, the content and the type of surveys are important factors, and that these vary from institution to institution, which might lead to different interpretations. The purpose of evaluation makes a difference to students' ratings; some universities conduct surveys to collect qualitative data specifically on …