Academic journal article International Journal of Education

Using Generalizability Theory to Examine Classroom Instructors' Analytic Evaluation of EFL Writing

Article excerpt

Abstract

Using G-theory as a theoretical framework, this study examined the variability and reliability of classroom instructors' analytic assessments of EFL writing by undergraduate students at a Turkish university. Ninety-four EFL papers written by Turkish-speaking students in a large-scale classroom-based English proficiency exam were scored analytically by three EFL raters. The results showed substantial rater variation. Ratings based on two assessment categories (i.e. communicative level and linguistic accuracy level) were also obtained. The variance component for scoring categories (c) accounted for 7.25% of the total score variance, suggesting that some of the difference in writing scores could be attributed to the scoring category itself. Further, the dependability coefficient was .53 for the current scenario, and even when the number of raters was increased to 10, the dependability coefficient reached only .79. This difference had a substantial impact on the reliability of analytic scoring of EFL papers. The findings provide evidence that classroom teachers should be appropriately trained to score EFL compositions. Important implications are discussed.

Keywords: EFL writing assessment; rating variability; rating reliability; generalizability theory


1. Introduction

Research with English-as-a-second-language (ESL) and English-as-a-foreign-language (EFL) students has shown that direct assessment of writing is both complex and challenging (Barkaoui, 2008; Connor-Linton, 1995; Hamp-Lyons, 1991; Huang, 2007, 2008, 2009, 2011; Huang & Foote, 2010; Huang & Han, 2013; Sakyi, 2000). This is because multiple sources contribute to the variability of ESL/EFL students' writing scores. On the one hand, students' age, first language, home culture, style of written communication, English proficiency, and the writing tasks themselves can affect their writing performance (e.g. Hinkel, 2002; Huang, 2007, 2009, 2011, 2012; Huang & Foote, 2010; Huang & Han, 2013; Kormos, 2011; Kroll, 1990; Shaw & Liu, 1998; Weigle, 2002; Yang, 2001); on the other hand, factors such as essay features, scoring methods, and raters' mother tongue, professional background, gender, experience, and type and amount of training can affect rater behavior and scoring outcomes (Barkaoui, 2008; Brown, 1991; Cumming, Kantor & Powers, 2001; Huang, 2007, 2008, 2009, 2011; Huang & Foote, 2010; Huang & Han, 2013; Sakyi, 2000; Shi, 2001; Shohamy, Gordon & Kraemer, 1992; Weigle, 1994, 1999, 2002; Weigle, Boldt, & Valsecchi, 2003).

In the field of EFL/ESL writing assessment, research has focused on improving the consistency and accuracy of ratings (Connor-Linton, 1995). Variability in raters' criteria is a prominent source of such inconsistencies: some raters may look primarily for quality of content while others attend to organization (Weigle, 2002), and still others may weigh these text features differently depending on the proficiency level of the essays (Cumming, 1990; Shi, 2001). A further essay-rater interaction concerns raters' varying backgrounds, such as composition-teaching experience, rating experience, cultural background, training, and expectations; these variables can be very influential in determining scores on writing tasks (Weigle, 2002). This study examines the variability and reliability of EFL writing assessment using generalizability (G-) theory rather than classical test theory (CTT) and attempts to extend the knowledge base by examining writing samples produced by undergraduate EFL students at a Turkish university.
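As background for how G-theory quantifies rater effects, the dependability (phi) coefficient for absolute decisions in a fully crossed persons x raters (p x r) design can be sketched as follows. This is the standard G-theory formula; the notation below is conventional and is not taken from this excerpt:

```latex
\Phi = \frac{\sigma^2_p}{\sigma^2_p + \dfrac{\sigma^2_r + \sigma^2_{pr,e}}{n'_r}}
```

Here \(\sigma^2_p\) is the person (true-score) variance, \(\sigma^2_r\) the rater variance, \(\sigma^2_{pr,e}\) the residual (person-by-rater interaction confounded with error), and \(n'_r\) the number of raters assumed in the decision (D-) study. Increasing \(n'_r\) shrinks the error term in the denominator, which is why adding raters raises dependability, as reported in the abstract.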

2. Literature Review

Much research in ESL/EFL writing assessment examines the scores assigned by markers or raters in order to investigate the validity and reliability of tests, scores, and evaluations. The CTT approach, the IRT approach (e.g., multi-faceted Rasch measurement), and the G-theory approach are the three theoretical frameworks used to address variability and reliability issues in the assessment of ESL/EFL writing (Huang, 2007). …
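To make the G-theory approach concrete, the following is a minimal sketch of a G-study and D-study for a fully crossed persons x raters design, estimating variance components from expected mean squares and then projecting dependability for different numbers of raters. The function names and any score matrix used with them are invented for this illustration; they do not come from the study reported in this article.

```python
import numpy as np

def variance_components(scores):
    """Estimate G-study variance components for a fully crossed
    persons x raters (p x r) design via expected mean squares.

    scores: 2-D array, rows = persons (essays), columns = raters.
    Returns (var_person, var_rater, var_residual)."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    # Sums of squares for persons, raters, and total
    ss_p = n_r * ((person_means - grand) ** 2).sum()
    ss_r = n_p * ((rater_means - grand) ** 2).sum()
    ss_tot = ((scores - grand) ** 2).sum()
    ss_res = ss_tot - ss_p - ss_r  # p x r interaction + error

    # Mean squares
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Solve the expected-mean-square equations; negative
    # estimates are truncated to zero, a common convention.
    var_res = ms_res
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)
    return var_p, var_r, var_res

def dependability(var_p, var_r, var_res, n_raters):
    """D-study phi (dependability) coefficient for absolute
    decisions, with n_raters raters averaged per essay."""
    return var_p / (var_p + (var_r + var_res) / n_raters)
```

Averaging over more raters divides the rater and residual variance in the denominator, so dependability rises with rater count; how quickly it rises depends entirely on the estimated variance components, which is the pattern the study's .53 (three raters) versus .79 (ten raters) figures illustrate.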
