and Sorkin (1985). They modeled a task in which the operator, after being alerted by a warning, made an observation and then a detection decision about the existence of a dangerous condition. The concern here has been how the PPV of the first detector affects the observed behavior of a human who is expected to respond immediately to a warning. To the extent that the human responds unreliably at some values of PPV, attention should be focused on the performance of the automatic detector as well as on the performance of the human.
What needs to be done? From my perspective, two systems issues have received less deliberate attention than they deserve. First, who is taking what kinds of considerations into account in setting the automatic detector's thresholds for issuing a warning? The engineer attempts continually to develop detectors of greater sensitivity but must work with what is available at a given time, and he or she must also accept the prior probabilities of a dangerous condition as being what they are. However, the decision threshold is a powerful variable that requires and rewards very explicit concern. We saw, for example, in Figure 2 that PPV can vary from near 1.0 to as little as .1 for a realistic prior probability, .001, as the FPP varies below .01. The tendency, observed and remarked on, to set the threshold to achieve a very high TPP without close attention to the corresponding values of FPP and PPV can undermine warning effectiveness to a large extent and perhaps completely.
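The dependence of PPV on the threshold follows directly from Bayes' rule. The following is a minimal sketch in Python, not the computation behind Figure 2: the function name `ppv` is hypothetical, and the TPP of .99 is an assumed value chosen for illustration. Only the prior of .001 and the FPP range below .01 come from the text.

```python
def ppv(prior, tpp, fpp):
    """Positive predictive value by Bayes' rule.

    prior: prior probability of the dangerous condition
    tpp:   true-positive proportion, P(warning | danger)
    fpp:   false-positive proportion, P(warning | no danger)
    """
    true_alarms = prior * tpp             # P(warning and danger)
    false_alarms = (1.0 - prior) * fpp    # P(warning and no danger)
    total = true_alarms + false_alarms
    if total == 0.0:
        raise ValueError("the detector never issues a warning")
    return true_alarms / total


if __name__ == "__main__":
    # Assumed TPP of .99; prior of .001 as cited in the text.
    for fpp in (0.01, 0.001, 0.0001, 0.00001):
        print(f"FPP = {fpp:<8} PPV = {ppv(0.001, 0.99, fpp):.3f}")
```

Under these assumed values, PPV is about .09 when FPP is .01 and about .99 when FPP is .00001, which matches the range described above.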
The second question is a repeat of the first in a setting (e.g., the aircraft cockpit) in which there are multiple automatic detectors (e.g., a dozen or more). When the thresholds of several automatic detectors are each set primarily to avoid a miss, the overall PPV of the system can be very low indeed. A procurement process in which several independent manufacturers set thresholds to meet a specified, very high TPP for their own detector--with an unspecified effect on its FPP and, hence, on the total system's PPV--is clearly not going to be adequate.
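How false alarms accumulate across several detectors can be sketched as follows. This is an illustration under stated assumptions, not a model taken from the text: the detectors are assumed to operate independently, the "pooled PPV" (the share of all warnings, across detectors, that are true) is only one possible definition of a system-level PPV, and the function names and numerical values are hypothetical.

```python
def pooled_ppv(detectors):
    """Share of all warnings, pooled over detectors, that are true.

    detectors: list of (prior, tpp, fpp) tuples, one per detector.
    Assumes each detector monitors its own condition independently.
    """
    detectors = list(detectors)
    true_rate = sum(p * t for p, t, _ in detectors)
    false_rate = sum((1.0 - p) * f for p, _, f in detectors)
    return true_rate / (true_rate + false_rate)


def prob_any_false_alarm(fpps):
    """Probability that at least one detector issues a warning in an
    observation interval in which no dangerous condition is present,
    assuming the detectors err independently."""
    no_alarm = 1.0
    for f in fpps:
        no_alarm *= (1.0 - f)
    return 1.0 - no_alarm


if __name__ == "__main__":
    # Twelve hypothetical detectors, each tuned for a high TPP.
    cockpit = [(0.001, 0.99, 0.01)] * 12
    print(f"pooled PPV: {pooled_ppv(cockpit):.3f}")
    print(f"P(any false alarm): {prob_any_false_alarm([0.01] * 12):.3f}")
```

With twelve such detectors, each with an FPP of .01, at least one false alarm occurs in about 11% of danger-free intervals, although no single detector exceeds 1%. Specifying only a TPP for each detector leaves this system-level quantity uncontrolled.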
The science exists for choosing the best decision thresholds in high-stakes detection and diagnostic settings (Chapter 5). These quantitative procedures focus attention, in any given setting, on the data that need to be gathered on prior probabilities and on the judgments that need to be made about costs and benefits of relevant behaviors--and they show how to combine this information optimally in selecting one or more thresholds. The discipline and assistance these procedures provide in designing and operating detection systems remain to be applied to systems in which warnings of danger are critical.
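As an illustration of how such procedures combine priors with costs and benefits, the sketch below uses the standard expected-value result from signal detection theory: the best threshold, expressed as a likelihood ratio, is the prior odds against danger multiplied by the ratio of the payoffs at stake when no danger exists to the payoffs at stake when it does. Converting that ratio into a TPP and an FPP requires a model of the detector, and the equal-variance Gaussian model with sensitivity `d_prime` is an assumption made here. All function names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def optimal_beta(prior, v_tp, c_fn, v_tn, c_fp):
    """Likelihood-ratio threshold that maximizes expected value.

    prior: prior probability of danger
    v_tp, v_tn: benefits of a true positive and a true negative
    c_fn, c_fp: costs (as positive numbers) of a miss and a false alarm
    """
    return ((1.0 - prior) / prior) * ((v_tn + c_fp) / (v_tp + c_fn))


def criterion_location(beta, d_prime):
    """Decision-axis cutoff for a given beta, assuming noise ~ N(0, 1)
    and signal ~ N(d_prime, 1)."""
    return math.log(beta) / d_prime + d_prime / 2.0


def rates_at(criterion, d_prime):
    """Return (tpp, fpp) for a cutoff under the same Gaussian model."""
    std = NormalDist()
    tpp = 1.0 - std.cdf(criterion - d_prime)
    fpp = 1.0 - std.cdf(criterion)
    return tpp, fpp


if __name__ == "__main__":
    # Hypothetical payoffs: a miss is 1,000 times as costly as a false alarm.
    beta = optimal_beta(prior=0.001, v_tp=0.0, c_fn=1000.0, v_tn=0.0, c_fp=1.0)
    tpp, fpp = rates_at(criterion_location(beta, d_prime=3.0), d_prime=3.0)
    print(f"beta = {beta:.3f}, TPP = {tpp:.3f}, FPP = {fpp:.3f}")
```

The point of the sketch is that the threshold falls out of explicit judgments about priors and payoffs, rather than from a target TPP chosen in isolation.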
Comstock, J. R., & Arnegard, R. J. (1992). The multi-attribute task battery for human operator workload and strategic behavior research (NASA Technical Memorandum No. 104174). Hampton, VA: National Aeronautics and Space Administration, Langley Research Center.