ACCURACY AND EFFICACY OF DIAGNOSES
Chapter 4 suggests that the discrimination acuity of diagnostic systems in any field--for example, in medicine, weather forecasting, information retrieval, materials testing, polygraph lie detection, and aptitude testing--can best be measured by ROC analysis and can conveniently be expressed on the same Az scale. In all cases, one wishes to neutralize the effect of a variable decision criterion and to achieve results that are also independent of the varying prior probabilities of the events or conditions to be discriminated. This chapter shows typical values of Az in the several diagnostic fields mentioned.
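The two independence properties just mentioned can be illustrated with a minimal sketch. It assumes hypothetical confidence ratings on a 1-5 scale and uses the empirical (nonparametric) area under the ROC, computed as the proportion of positive-negative pairs that are correctly ordered. This is a simpler stand-in chosen for illustration and need not equal Az as defined in Chapter 4. The function names and data are invented for the example.

```python
from itertools import product


def empirical_auc(pos_scores, neg_scores):
    """Proportion of (positive, negative) pairs in which the positive case
    receives the higher rating; ties count as one half."""
    wins = 0.0
    for p, n in product(pos_scores, neg_scores):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def operating_point(pos_scores, neg_scores, criterion):
    """True-positive and false-positive proportions when every rating at or
    above `criterion` is called positive."""
    tpr = sum(s >= criterion for s in pos_scores) / len(pos_scores)
    fpr = sum(s >= criterion for s in neg_scores) / len(neg_scores)
    return tpr, fpr


# Hypothetical ratings (assumed data, not taken from the text).
pos = [5, 4, 4, 3, 5, 2]   # cases in which the condition is present
neg = [1, 2, 3, 2, 1, 4]   # cases in which the condition is absent

auc = empirical_auc(pos, neg)

# Make the positive condition ten times rarer by replicating the negatives:
# the prior probabilities change, but the area does not.
auc_rare = empirical_auc(pos, neg * 10)

# A strict and a lenient criterion give different operating points,
# yet both lie on the same ROC and so share the same area.
strict = operating_point(pos, neg, criterion=4)
lenient = operating_point(pos, neg, criterion=2)

print(f"area = {auc:.3f}, area with rarer positives = {auc_rare:.3f}")
print(f"strict (TP, FP) = {strict}, lenient (TP, FP) = {lenient}")
```

Moving the criterion changes the true-positive and false-positive proportions, and changing the mix of cases changes how often each outcome occurs, but the area summarizes the whole curve and is unaffected by either.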
Measuring performance in practical diagnostic applications is necessarily done with less precision than in psychological tasks simplified for the laboratory. In general, the "truth data" against which responses are scored are less than perfect, and the samples of cases selected for a test may be biased in some manner or simply less than fully representative of natural variation. As described in this chapter, the diagnostic fields vary widely in the extent to which test samples that are satisfactory with respect to these and related factors can be obtained.
Chapter 5 suggests that signal detection theory offers the science for choosing the best decision criterion (or "decision threshold") in diagnostic practice. If the benefits and costs of the four possible decision outcomes considered--true positives, false positives, true negatives, and false negatives--can be determined or assigned, and if the prior probabilities of the positive and negative