Academic journal article Journal of College Science Teaching

When Wrong Answers Receive Top Grades

Article excerpt

Imagine you are grading a test on fractions. How many of a possible 5 points would you give the student who produces the following answer?

16/64 = 16/64 = 1/4 (in the middle fraction, both 6s are crossed out)

How confident are you that you graded the answer correctly? Would you be less confident if you learned that 65% of your colleagues gave the student a different grade? The data we collected show that no matter how many points you awarded the answer above, at least 65% of your colleagues believe the student deserved a different grade. At a time when students are increasingly forced to prepare for or take high-stakes tests because of No Child Left Behind (NCLB; 2002), it is imperative that the education community come to a consensus about what we are looking for when we evaluate assessments and attempt to ensure consistency across different graders (American Educational Research Association [AERA], 1999).

To find out whether the education community shares a collective understanding about how students should be evaluated, we surveyed 202 educators (from all grade levels) and scientists attending assessment workshops (Pennsylvania, California, and Massachusetts) or judging a national student competition (Washington, DC). The educators and scientists graded hypothetical student responses to trivial math problems with definitive answers. The graders were first asked to grade one problem for which the instruction to the hypothetical student was "simplify the fraction" and then asked to grade the same problem when the new instruction was "simplify the fraction and show all work." The graders were then asked to grade three problems for which the instruction to the hypothetical student was "simplify the fraction" and then asked to grade the same three problems when the new instruction was "simplify the fraction and show all work." Depending on the person grading the question, the same student answer received anywhere from no points to full credit. When the instructions preceding a question changed, the graders often changed how they evaluated the students, even though the evidence about what the students knew remained the same. After the instructions changed, some graders awarded more credit for three wrong answers than for three right answers. The graders shared no consensus about how student answers should be graded. If students are going to be evaluated using tests, the education community must create tighter rubrics that ensure a higher degree of inter- and intragrader reliability.

Is there intergrader reliability in the education community?

If a test has intergrader reliability, the same student answer receives the same number of points regardless of who grades the test (Heubert & Hauser, 1999). To test whether there is intergrader reliability in the education community, we first asked 202 scientists and educators to assign a grade of 0, 1, 2, 3, 4, or 5 to the following student answers.

ANSWER of Student A:

16/64 = 1/4

ANSWER of Student B:

16/64 = 16/64 = 1/4 (in the middle fraction, both 6s are crossed out)

Over 77% of the graders awarded Student A full credit for this answer. The answer is mathematically correct, but a significant minority of the graders decided that the answer did not deserve full credit. These graders all agreed that the student performed well--nobody disputed that 1/4 is the correct answer--but some wanted to see more proof that the student used a correct procedure before giving full credit.

When there was some evidence that a hypothetical student used an improper procedure to arrive at the answer, there was even less intergrader agreement. The crossed-out 6s in Student B's answer suggest that he improperly "cancelled" the 6s and arrived at the correct solution only by chance. Over 20% of the graders reacted so negatively to Student B's cancelling marks that they gave the student no credit for the correct answer. Another 32% of the graders were not at all troubled by the evidence and gave the student full credit. …
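To see why Student B's procedure reaches the right answer only by chance, the short Python sketch below applies the same digit-crossing shortcut to other fractions and compares the result with the properly reduced value. It is an illustration only, not part of the study: the function names `cancel_shared_digit` and `shortcut_is_right` are hypothetical, and the sketch assumes the shortcut means deleting the units digit of a two-digit numerator and the matching tens digit of a two-digit denominator, as the crossed-out 6s suggest.

```python
from fractions import Fraction


def cancel_shared_digit(numerator, denominator):
    """Apply the invalid "cross out the shared digit" shortcut.

    Assumed pattern (hypothetical, modeled on Student B's answer): the units
    digit of the numerator matches the tens digit of the denominator, and
    both are deleted. Returns the leftover fraction, or None if the pattern
    does not apply.
    """
    n_tens, n_units = divmod(numerator, 10)
    d_tens, d_units = divmod(denominator, 10)
    if n_units != d_tens or d_units == 0:
        return None
    return Fraction(n_tens, d_units)


def shortcut_is_right(numerator, denominator):
    """True only when the shortcut happens to equal the real reduced value."""
    shortcut = cancel_shared_digit(numerator, denominator)
    return shortcut is not None and shortcut == Fraction(numerator, denominator)


# Student B's fraction: the shortcut and the correct reduction agree.
print(cancel_shared_digit(16, 64), Fraction(16, 64))  # 1/4 1/4

# A similar-looking fraction: the shortcut gives 1/4, the true value is 1/2.
print(cancel_shared_digit(12, 24), Fraction(12, 24))  # 1/4 1/2

# Every proper two-digit fraction for which this shortcut is accidentally right.
lucky = [
    (n, d)
    for n in range(10, 100)
    for d in range(n + 1, 100)
    if shortcut_is_right(n, d)
]
print(lucky)
```

Under these assumptions, the shortcut fails for almost every fraction it can be applied to, which is why a grader who sees the crossed-out 6s has reason to doubt that the student understands simplification, even though the final answer is correct.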
