Do Students Know If They Answered Particular Questions Correctly on a Psychology Exam?
Rosenthal, Gary T., Soper, Barlow, McKnight, Richard R., Price, A. W., Boudreaux, Monique, Rachal, K. Chris, Journal of Instructional Psychology
The current study explores students' abilities to make different metacognitive judgments about the same material. Sophomores in a psychology class indicated how confident they were that each answer on their final was correct (micro-level judgments) and predicted and postdicted their overall score (macro-level judgments). Students made the series of simpler micro-level metacognitive judgments ("Did I answer a particular question correctly?") more effectively than the more complex macro-level judgments of predicting and postdicting "What will be/was my overall exam score?" Data indicate that when assessing performance on the same exam, different metacognitive tasks produce different success rates. In addition, there was a significant tendency for students to assign higher confidence ratings to their correct answers, but no significant tendency to assign lower confidence ratings to their incorrect answers.
In a recent study, Dunning, Heath and Suls (2004) examined self-assessment in a variety of settings. Their overall conclusion was that the relationship between people's self-ratings and their performance in many areas was "moderate to meager" (p. 69).
Self-assessment is of value for students since it implies the possibility of improved performance. Falchikov and Boud's (1989) meta-study of self-assessment in higher education reported only a modest link between how students thought they would perform and how they actually did. In academia, self-assessment involves metacognitive skills that affect how a person studies and learns, such as the ability to monitor learning, attention and memory. In a laboratory, subjects might study to-be-recalled items (e.g., paired associates) and predict which items they will later remember. Afterward, researchers compare the Judgments of Knowing (JOK) data to the actual test results.
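The laboratory comparison described above can be illustrated with a minimal sketch. It assumes each JOK is a simple yes/no prediction per item and scores accuracy as the proportion of items on which the prediction matched the recall outcome. The function name `jok_accuracy` and the data are hypothetical; the studies cited here do not specify this measure, and published work may use other indices.

```python
from typing import Sequence


def jok_accuracy(predicted: Sequence[bool], recalled: Sequence[bool]) -> float:
    """Proportion of items where the JOK prediction matched the outcome.

    predicted[i] is True if the subject predicted they would recall item i.
    recalled[i] is True if the subject actually recalled item i.
    This is an illustrative measure, not one taken from the cited studies.
    """
    if len(predicted) != len(recalled):
        raise ValueError("predicted and recalled must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one item is required")
    matches = sum(1 for p, r in zip(predicted, recalled) if p == r)
    return matches / len(predicted)


# Hypothetical data for four paired associates.
example_predicted = [True, True, False, False]
example_recalled = [True, False, False, True]
example_score = jok_accuracy(example_predicted, example_recalled)
```

In this made-up example the subject's predictions match the outcome on two of four items, so the score is 0.5.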
A more "ecologically authentic" instance of JOK occurs when students predict/postdict their exam scores. Student prediction and postdiction of test scores in college courses has a rich literature (Murray, 1980; Shaughnessy, 1979; Sinkavich, 1995; Sjostrom & Marks, 1992). Maki and Berry (1984) had students predict their performance on a multiple-choice textbook exam of 18 questions. Most students predicted with some accuracy immediately before the test. In addition, students scoring above the median accurately predicted scores up to 72 hours before the exam.
Balch (1992) had Introductory Psychology students pre- and postdict the number of questions they answered correctly on a comprehensive 75-item multiple-choice final. Poor students' pre- and postdictions were overestimates; average students overestimated before the final, but not after. The best students accurately pre- and postdicted their scores. However, Balch required his students to pre- and postdict the number of items correct on his 75-item final. More typically, college exams are 50 or 100 items and results are reported as percentage correct. By asking for the number correct of 75 items rather than percentage correct of 50 or 100 items, Balch may have made estimating needlessly difficult (see Rosenthal et al.'s 1996 discussion of this and other possible problems).
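The arithmetic behind this concern can be sketched as follows. A student who thinks in percentages has to convert to an item count on a 75-item exam, where each item is worth about 1.33 percentage points rather than the 2 points (50 items) or 1 point (100 items) of more typical exams. The function names are hypothetical and the numbers are illustrative, not data from Balch (1992).

```python
def items_to_percent(items_correct: int, total_items: int) -> float:
    """Convert a count of correct items to percentage correct."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= items_correct <= total_items:
        raise ValueError("items_correct must be between 0 and total_items")
    return 100.0 * items_correct / total_items


def percent_to_items(percent: float, total_items: int) -> float:
    """Convert an expected percentage correct to an expected item count."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    return percent * total_items / 100.0


def points_per_item(total_items: int) -> float:
    """Percentage points that a single item contributes to the score."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    return 100.0 / total_items


# Illustrative: a student expecting 80% must report 60 items on a 75-item exam.
expected_items = percent_to_items(80, 75)
```

On a 100-item exam the expected percentage and the expected item count are the same number, so no conversion step is needed.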
Rosenthal et al. (1996) addressed some of the potential problems in Balch's (1992) design and also assessed student awareness of the topics on their exam. They found that better students were more accurate at postdiction of overall scores and were more aware of what topics had appeared.
Rather than having students pre- and postdict their overall exam scores, Sinkavich (1995) had students rate their confidence in each of 209 multiple-choice answers on three exams.
Ratings were based on a five-point Likert scale reported as "not correct (-2)" to "correct (+2)" with a midpoint interpreted as "maybe it is correct; maybe it is not correct (0)." He also provided them with feedback as to their rating accuracy on the first two tests. …