Academic journal article Psychonomic Bulletin & Review

The Disutility of the Hard-Easy Effect in Choice Confidence

Article excerpt

A common finding in confidence research is the hard-easy effect, in which judges exhibit greater overconfidence for more difficult sets of questions. Many explanations have been advanced for the hard-easy effect, including systematic cognitive mechanisms, experimenter bias, random error, and statistical artifact. In this article, I mathematically derive necessary and sufficient conditions for observing a hard-easy effect, and I relate these conditions to previous explanations for the effect. I conclude that all types of judges exhibit the hard-easy effect in almost all realistic situations. Thus, the effect's presence cannot be used to distinguish between judges or to draw support for specific models of confidence elicitation.

Confidence and its calibration are oft-studied topics in the decision sciences. The topics are relevant to many applied areas, including finance (Thomson, Önkal-Atay, Pollock, & Macaulay, 2003), meteorology (Murphy & Winkler, 1984), and eyewitness testimony (Wells, Ferguson, & Lindsay, 1981). Psychological research on confidence also has implications for the elicitation of prior distributions in Bayesian models (e.g., O'Hagan et al., 2006). This general applicability of confidence elicitation contributes to its popularity as a research area.

In the applications above, confidence is usually expressed as a probability: Given a single event, 0 expresses certainty that the event will not occur, 1 expresses certainty that the event will occur, and intermediate probabilities express intermediate levels of certainty. This is known to decision researchers as a no choice-100 (NC100) task (terminology from Ronis & Yates, 1987). In an alternative task, the choice-50 (C50) task, judges choose between two alternatives and then report confidence in their choice. Confidence is bounded at .5 and 1 because, if the judge's confidence is below .5, he or she should have chosen the other alternative.

Regardless of the task, researchers often examine a judge's calibration by comparing average confidence (c̄) over a set of events to proportion correct (p̄) over the same set. This results in the overconfidence statistic (OC):

OC = c̄ − p̄ (1)

Judges are said to be well calibrated if OC = 0, that is, if their average confidence matches their proportion correct. It is very common to find that OC > 0, that is, that judges are overconfident.
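As a minimal sketch, Equation 1 can be computed directly from a judge's item-level responses. The data below are hypothetical C50-task responses (confidence bounded in [.5, 1]), invented purely for illustration:

```python
# Overconfidence statistic (Eq. 1): OC = mean confidence (c_bar) minus
# proportion correct (p_bar) over the same set of events.
# Hypothetical C50-task data: confidence per item, and whether the
# chosen alternative was correct (1) or not (0).

confidences = [0.9, 0.7, 0.8, 0.6, 1.0]
correct = [1, 0, 1, 1, 0]

c_bar = sum(confidences) / len(confidences)  # average confidence = 0.8
p_bar = sum(correct) / len(correct)          # proportion correct = 0.6
OC = c_bar - p_bar                           # OC = 0.2 > 0: overconfident

print(round(OC, 2))  # → 0.2
```

A judge with OC = 0 would be well calibrated on this set; here the positive OC indicates overconfidence.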

A second ubiquitous finding in confidence research deals with the magnitude of OC at different difficulty levels. This finding, termed the hard-easy effect, was described in detail by Lichtenstein, Fischhoff, and Phillips (1982; see also Lichtenstein & Fischhoff, 1977). They found that people tend to exhibit more overconfidence for hard sets of questions than for easy sets of questions. Across experiments or question sets, a hard-easy effect for the C50 task is displayed in Figure 1. Proportion correct is on the x-axis, overconfidence is on the y-axis, and each point represents a hypothetical experiment or question set. The points show the general hard-easy trend: As p̄ increases, OC decreases.
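The trend in Figure 1 can be illustrated with a toy calculation. Assuming, hypothetically, that average confidence stays roughly constant while question sets vary in difficulty (one of the explanations discussed below, the insensitivity of confidence to task difficulty), OC falls mechanically as p̄ rises; the set labels and numbers here are invented:

```python
# Toy illustration of the hard-easy trend: with mean confidence held
# roughly constant across question sets, OC = c_bar - p_bar decreases
# as proportion correct (p_bar) increases.

p_bar_by_set = {"hard": 0.60, "medium": 0.75, "easy": 0.90}  # hypothetical
c_bar = 0.85  # assumed roughly constant across sets

oc_by_set = {name: c_bar - p for name, p in p_bar_by_set.items()}
for name, oc in oc_by_set.items():
    print(f"{name}: OC = {oc:+.2f}")
# hard set shows large overconfidence (+0.25); easy set slight
# underconfidence (-0.05), reproducing the downward trend of Figure 1.
```

This is only one mechanism consistent with the pattern; the explanations surveyed next differ in what they take confidence to track.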

Many explanations have been advanced for the hard-easy effect, including insufficient placement of confidence criteria in a signal detection framework (Ferrell & McGoey, 1980; Suantak, Bolger, & Ferrell, 1996), random error (Erev, Wallsten, & Budescu, 1994), the insensitivity of confidence to task difficulty (Price, 1998; von Winterfeldt & Edwards, 1986), and cognitive bias (Griffin & Tversky, 1992). Whereas these explanations refer to within-judge factors, other researchers have proposed that the experimental design itself contributes to the hard-easy effect. For example, Gigerenzer, Hoffrage, and Kleinbölting (1991) showed that the biased selection of test questions can yield a hard-easy effect: If an experimenter chooses more trick questions than are usually found in some domain, for example, we might expect a judge's confidence to be artificially high and accuracy to be artificially low. …
