The Legal System's Use of EPIDEMIOLOGY: Some clarifications/The Authors Respond
Korzeniewski, Steven James, Bryant, Arthur H., Reinert, Alexander A., Judicature
In their Judicature article "The legal system's use of epidemiology" (July-August 2003), Arthur Bryant and Alexander Reinert critique the judicial system's interpretation of population-based research, stating, "...the judicial system has failed to apply accurately scientific knowledge, and particularly epidemiology, in the courtroom." The proposed aim of the article was to "improve the means by which law relies on scientific disciplines." However, it contains several potentially harmful misinterpretations of statistical concepts and demonstrates a lack of understanding regarding the relevance of epidemiology to etiologic hypotheses and the consensus of the scientific community.
Thus, several clarifications are necessary if judges are to gain a full understanding of the concepts the authors attempted to convey. In particular, this article is intended to clarify misinterpretations and misunderstandings regarding the denial of the ability to conduct significance testing with confidence intervals; the confusion of a p-value with an alpha level; and the contention that preference toward epidemiological evidence when attributing causation is "against the weight of history" and not supported by the scientific community.
I agree with Bryant and Reinert that the label of "statistical insignificance" should not bar experts from testifying about particular studies. I also believe, however, that judges are correct in barring testimony that asserts causal hypotheses based solely on single studies, significant or not, because such testimony would in no way be accepted within the scientific community and would violate even the former Frye rules of admissibility. Thus, I question Bryant and Reinert's interpretation of judicial actions barring testimony based on significance, because the context of those actions is not elaborated on in their article. I did not find any objective support for their assertion that judges were viewing studies/testimony only in relation to significance, or the absence thereof.
Furthermore, Bryant and Reinert express disapproval of courts labeling study results "statistically insignificant" when their confidence intervals included "one." But this label is absolutely justified regardless of the effect size, because statistical significance is defined in relation to a confidence/significance level. It is impossible to say an association is statistically significant when a significance or confidence level has been violated.1 Statistically insignificant results may still provide indications of a true association, although it is important to understand that, in such cases, the probability of the association being derivative of chance when the null hypothesis is true exceeded a pre-specified limit (alpha).
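To make the correspondence between a confidence interval and a significance test concrete, the following is a minimal Python sketch. It is illustrative only: the counts are hypothetical, the function name `risk_ratio_test` is my own, and the calculation assumes a large-sample (Wald) approximation on the log scale of the risk ratio. None of it is drawn from Bryant and Reinert's article or from any study they discuss. Under those assumptions, a 95 percent confidence interval that includes one and a p-value above an alpha of 0.05 are two expressions of the same result.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_test(a, n1, c, n0, alpha=0.05):
    """Wald test and confidence interval for a risk ratio (hypothetical helper).

    a / n1: cases and total among the exposed.
    c / n0: cases and total among the unexposed.
    alpha:  the significance level, fixed BEFORE looking at the data.
    """
    rr = (a / n1) / (c / n0)
    # Large-sample standard error of log(RR); an assumption of this sketch.
    se = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)

    # The p-value is computed from the data under the null hypothesis RR = 1.
    z = log(rr) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # The (1 - alpha) confidence interval uses the same standard error.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = exp(log(rr) - z_crit * se)
    upper = exp(log(rr) + z_crit * se)

    return {
        "rr": rr,
        "ci": (lower, upper),
        "p_value": p_value,
        "significant_by_p": p_value < alpha,
        "significant_by_ci": not (lower <= 1.0 <= upper),
    }


if __name__ == "__main__":
    # Hypothetical counts: 30 vs. 20 cases, then 60 vs. 20 cases, per 1,000 persons.
    for exposed_cases in (30, 60):
        result = risk_ratio_test(exposed_cases, 1000, 20, 1000)
        print(result)
```

In this sketch the two flags, `significant_by_p` and `significant_by_ci`, always agree, because both are derived from the same estimate, the same standard error, and the same alpha. Note also that alpha is an input chosen in advance, whereas the p-value is an output computed from the data; the two are not interchangeable.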
The authors were most likely attempting to convey a long-standing controversy in epidemiology between statistical and clinical significance. Statistical significance merely indicates whether a result is derivative of chance given the null hypothesis at a specific level of confidence. On the other hand, clinical significance is defined in Last's Dictionary of Epidemiology as "a difference in effect size considered by experts to be important in clinical or policy decisions, regardless of the level of statistical significance."2 Granted, clinically significant results are not always statistically significant. However, judges are absolutely correct in characterizing study results as statistically insignificant when a pre-specified limit on random error has been violated. Significance testing always requires a pre-specified significance/confidence level, or alpha. The use of an alpha level does not preclude the ability to further analyze information captured by the p-value and/or confidence interval; it merely puts the p-value and/or confidence interval into the context of the probability of those results being derivative of chance alone when in reality there is no association. Thus, judges should not be criticized for using a pre-specified cut-off for false positive results when in fact this practice is mandated by the act of significance testing. …