The Rating Experiment and Empirical ROCs
In the last two chapters, we described correspondence experiments in which people report which of two events (such as seeing a New or Old face) had occurred. According to detection theory, they do this by comparing the strength of evidence, which we called familiarity, with a criterion. Observations of more than criterial familiarity are called “old,” and those below criterion are called “new.” The criterion is placed at a location of the observer's choice: Strict criteria serve to minimize false alarms, lax criteria to minimize misses.
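The trade-off between strict and lax criteria can be illustrated with a minimal sketch. It assumes, purely for illustration, that familiarity is normally distributed with unit variance, centered at 0 for New items and at a hypothetical sensitivity `d_prime` for Old items; the function name `yes_no_rates` and the criterion values are our own choices, not taken from any data.

```python
from statistics import NormalDist


def yes_no_rates(criterion, d_prime=1.0):
    """Return (hit_rate, false_alarm_rate) for a single criterion.

    Assumption (illustrative only): familiarity of New items ~ N(0, 1)
    and familiarity of Old items ~ N(d_prime, 1).
    """
    new_items = NormalDist(mu=0.0, sigma=1.0)
    old_items = NormalDist(mu=d_prime, sigma=1.0)
    # The observer says "old" whenever familiarity exceeds the criterion.
    hit_rate = 1.0 - old_items.cdf(criterion)         # "old" to Old items
    false_alarm_rate = 1.0 - new_items.cdf(criterion)  # "old" to New items
    return hit_rate, false_alarm_rate


# Hypothetical criterion locations: a lax one and a strict one.
for label, c in [("lax", 0.0), ("strict", 1.0)]:
    h, f = yes_no_rates(c)
    print(f"{label}: hits={h:.3f}  false alarms={f:.3f}  misses={1 - h:.3f}")
```

Under these assumptions, moving the criterion upward lowers the false-alarm rate at the cost of more misses, and moving it downward does the reverse.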
If observers can set different criteria in different experimental conditions, they must know more about events in their experience than is needed to make a simple yes-no judgment. In this chapter, we see how observers can make graded reports about the degree of their experience by setting multiple criteria simultaneously. Our two primary examples are both tests of recognition memory, but for rather different materials: odors and words.
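One way to picture several criteria held at once is as a set of cut points along the familiarity axis, with each observation assigned to the interval it falls in. The sketch below assumes three hypothetical criterion locations and four response labels of our own invention; an observation that lands exactly on a criterion is treated as exceeding it.

```python
from bisect import bisect_right

# Hypothetical criterion locations on the familiarity axis, in ascending order.
CRITERIA = [0.0, 0.5, 1.0]
# Three criteria divide the axis into four ordered response categories.
RESPONSES = ["sure new", "probably new", "probably old", "sure old"]


def rate(familiarity, criteria=CRITERIA, responses=RESPONSES):
    """Map one familiarity value to a graded response."""
    # bisect_right counts the criteria at or below the observation,
    # which is the index of the interval the observation falls in.
    return responses[bisect_right(criteria, familiarity)]


for x in (-2.0, 0.2, 0.7, 2.0):
    print(x, "->", rate(x))
```

In general, m criteria yield m + 1 ordered responses, so a single criterion reduces this scheme to the yes-no judgment.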
How is memory for odors affected by the passage of time? Rabin and Cain (1984) presented participants with 20 odors to remember, then tested them at delays of 10 minutes, 1 day, and 7 days. At each test, a different set of 20 New odors was intermixed with the Old stimuli.
Observers labeled each smell as “old” or “new” and also rated their confidence in these answers on a 5-point scale, which we have reduced to a 3-point scale for illustrative purposes. Thus, there are two kinds of stimuli