What's under the ROC? An Introduction to Receiver Operating Characteristics Curves

Article excerpt

It is often necessary to dichotomize a continuous scale to separate respondents into normal and abnormal groups. However, because the distributions of the scores in these 2 groups most often overlap, any cut point that is chosen will result in 2 types of errors: false negatives (that is, abnormal cases judged to be normal) and false positives (that is, normal cases placed in the abnormal group). Changing the cut point will alter the numbers of erroneous judgments but will not eliminate the problem. A technique called receiver operating characteristic (ROC) curves allows us to determine the ability of a test to discriminate between groups, to choose the optimal cut point, and to compare the performance of 2 or more tests. We discuss how to calculate and compare ROC curves and the factors that must be considered in choosing an optimal cut point.

(Can J Psychiatry 2007;52:121-128)

Information on funding and support and author affiliations appears at the end of the article.

Highlights

* ROC analysis is used to select the optimal cut point to dichotomize a continuous scale.

* The usual choice of cut points minimizes the overall number of false positive and false negative errors.

* The cut point may be shifted if the cost of false positives is higher than that of false negatives, or vice versa.

* The accuracy of ROC analysis depends on the quality of the gold standard, which is usually far from perfect in psychiatry.

* Changing the purpose of the test (for example, from diagnosis to screening) requires a shift in cut points.

* A cut point that is ideal for one group may be less than ideal for another.
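The second and third highlights can be illustrated with a minimal sketch. It assumes that scores at or above the cut point are called abnormal and that a gold-standard label is available for every respondent. The function names, the cost weights, and the 12 scores are hypothetical, invented purely for illustration; they do not come from the article.

```python
from typing import Sequence, Tuple


def error_counts(scores: Sequence[float], abnormal: Sequence[bool],
                 cut: float) -> Tuple[int, int]:
    """Return (false positives, false negatives) when score >= cut means abnormal."""
    fp = sum(1 for s, a in zip(scores, abnormal) if s >= cut and not a)
    fn = sum(1 for s, a in zip(scores, abnormal) if s < cut and a)
    return fp, fn


def best_cut_point(scores: Sequence[float], abnormal: Sequence[bool],
                   fp_cost: float = 1.0, fn_cost: float = 1.0) -> float:
    """Try every observed score as a cut point and keep the cheapest one.

    With equal costs this minimizes the total number of errors; unequal
    costs shift the cut point toward avoiding the more expensive error.
    Ties are resolved in favor of the lowest cut point.
    """
    def total_cost(cut: float) -> float:
        fp, fn = error_counts(scores, abnormal, cut)
        return fp_cost * fp + fn_cost * fn

    return min(sorted(set(scores)), key=total_cost)


# Hypothetical data: 7 normal and 5 abnormal respondents with overlapping scores.
scores = [3, 5, 8, 10, 12, 14, 15, 16, 17, 20, 22, 25]
abnormal = [False, False, False, False, True, False,
            False, False, True, True, True, True]

equal_cost_cut = best_cut_point(scores, abnormal)
fn_heavy_cut = best_cut_point(scores, abnormal, fn_cost=5.0)
print(equal_cost_cut, fn_heavy_cut)
```

With these made-up numbers, weighting a false negative 5 times as heavily as a false positive moves the chosen cut point downward, so that fewer abnormal cases are missed at the price of more false alarms.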

Key Words: receiver operating characteristic, ROC, area under the curve, AUC, test performance, diagnosis, sensitivity, specificity

Abbreviations used in this article

AUC: area under the curve

CAT: computed axial tomography

CCHS: Canadian Community Health Survey

CI: confidence interval

FN: false negative

FP: false positive

pAUC: partial area under the curve

PNP: photonumerophobia

PPV: positive predictive value

ROC: receiver operating characteristics

SE: standard error

SPNP: Scale of Photonumerophobia

TN: true negative

TP: true positive

Those of you who have read this series of articles religiously know that, because of the tremendous loss of information incurred, you should never dichotomize continuous variables (1). Never! Nohow! Ever! Under no circumstances! Except, of course, when it makes sense to do so. One legitimate reason for dichotomizing occurs when a statistical test requires a linear relationship between variables (for example, multiple regression) but the actual relationship isn't linear. It then makes sense to dichotomize or trichotomize the predictor variable. A more common reason occurs when a dichotomous decision must be predicated on a continuous scale. For clinical or research purposes, it may be necessary to divide individuals into 2 groups (say, with or without depression) on the basis of an interview or a scale of depressive symptoms. This is done, for example, with the Center for Epidemiologic Studies Depression Scale (2), where the score can range from 0 to 60. Those who score 17 or more are deemed to suffer from depression; those with lower scores are classified as being without depression. The issue now becomes how we choose the cut point that best divides the sample into these 2 groups.
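To make the dichotomizing step concrete, here is a small sketch that splits scale scores at a cut point of 17 and tabulates the result against a gold-standard diagnosis. The 10 scores, the diagnoses, and the function names are hypothetical and serve only as an illustration, not as data or methods from any study.

```python
from typing import Dict, List, Sequence


def dichotomize(scores: Sequence[float], cut_point: float) -> List[bool]:
    """Label each respondent as a case (True) if the score is at or above the cut point."""
    return [s >= cut_point for s in scores]


def two_by_two(predicted: Sequence[bool], actual: Sequence[bool]) -> Dict[str, int]:
    """Cross-tabulate test results against the gold standard."""
    table = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            table["TP"] += 1      # case correctly detected
        elif p and not a:
            table["FP"] += 1      # non-case placed in the case group
        elif not p and a:
            table["FN"] += 1      # case judged to be a non-case
        else:
            table["TN"] += 1      # non-case correctly ruled out
    return table


# Hypothetical scale scores (possible range 0 to 60) and gold-standard diagnoses.
scores = [4, 9, 12, 16, 17, 21, 25, 30, 38, 45]
depressed = [False, False, False, True, False, True, True, False, True, True]

counts = two_by_two(dichotomize(scores, cut_point=17), depressed)
sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])  # cases detected
specificity = counts["TN"] / (counts["TN"] + counts["FP"])  # non-cases ruled out
print(counts, sensitivity, specificity)
```

Repeating this tabulation at other cut points changes the balance between the 2 kinds of error, which is the trade-off that the choice of cut point has to settle.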

For historical reasons, the method that's used is called ROC analysis. The name dates back to World War II and the merging of signal detection theory with the development of radar. When the gain of the radar set (comparable to the volume control on a radio) is at zero, no signal (in this case representing an enemy plane) is detected. Increasing the gain lets more signals in, but it also increases the amount of noise that is picked up and possibly misinterpreted as a true signal. …