The assessment of writing has long been considered a problematic area for educational assessment professionals. As Speck and Jones (1998) stated, "there are more problems than solutions: problems of inter-grader reliability, single-grader consistency, and ultimate accountability for the grades we assign" (p. 17). Because of the different linguistic and cultural backgrounds of English-as-a-second-language (ESL) students, the assessment of their English writing is more problematic than the assessment of native English-speaking students' writing (Connor-Linton, 1995; Hamp-Lyons, 1991a; Sakyi, 2000; Sweedler-Brown, 1993). On the one hand, many factors affect ESL students' writing, including their English proficiency, mother tongue, home culture, and style of written communication (Casanave & Hubbard, 1992; Hinkel, 2003; Shaw & Liu, 1998; Yang, 2001). On the other hand, raters may weigh these factors differently when rating ESL students' writing. Empirical studies have noted differences in rater behavior toward ESL students (Bachman, 2000). Rater background, mother tongue, previous experience, amount of prior training, and the types and difficulty of writing tasks have all been found to affect the rating of ESL students' written responses (Brown, 1991; Kobayashi, 1992; Sakyi, 2000; Santos, 1988; Weigle, 1994, 1999; Weigle, Boldt, & Valsecchi, 2003). The impact of these factors raises questions about the accuracy, precision, and, ultimately, the fairness of the scores assigned to written work produced by ESL students.
Fairness is a priority in the field of educational assessment. Educational organizations, institutions, and individual professionals should make assessments as fair as possible for test takers of different races, genders, and ethnic backgrounds (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Cole & Zieky, 2001; Joint Advisory Committee, 1993). Owing to the significant growth in the number of ESL students educated in North American schools over the past two decades (CBIE, 2002; IIE, 2001), fairness issues in ESL writing assessment have become of growing interest and importance (Connor-Linton, 1995; Hamp-Lyons, 1991a; Kunnan, 2000; McDaniel, 1985; Sweedler-Brown, 1993; Vaughan, 1991). Increasingly, writing-proficiency standards are being established for both secondary school and university students in North America without regard to students' native languages (EQAO, 2002; Johnson, Penny, & Gordon, 2000; Thompson, 1990). ESL students have to write the same tests as native English (NE) students. Like NE students, ESL students are expected to successfully demonstrate their English writing skills or complete high-stakes essay examinations (Casanave & Hubbard, 1992; Hayward, 1990; Wiggins, 1993).
However, research shows that ESL students face considerable challenges in passing institutional or provincial/state writing competency examinations (Blackett, 2002; Johns, 1991; Ruetten, 1994; Thompson, 1990). Further, these challenges may stem from more than language deficiencies. For example, rater severity or leniency could lead to underestimates or overestimates of ESL students' performance on these writing examinations. Previous studies have also found that raters with different teaching experience assign different scores to the same piece of ESL writing (Cumming, 1990b; Hamp-Lyons, 1996; Rubin & Williams-James, 1997). Such rater variability, therefore, may threaten the fairness of assessments of ESL writing.
Assessing and evaluating ESL writing involves both assigning a score or grade to an essay and commenting on it (Perkins, 1983). This review paper discusses only the scoring of ESL essays. Therefore, the terms "grading," "rating," and "scoring" are used interchangeably in this paper, referring to the process raters use to arrive at the scores students will receive (Speck & Jones, 1998). …