The article addresses three types of significance of research results: statistical, practical and clinical. They are treated as successive stages in the evolution of how the results of clinical research have been reported. For a long time, statistical significance was the only way of reporting research results. This approach drew severe criticism, which showed that estimating the probability that results were obtained by chance is not satisfactory from a clinical point of view. Statistical significance was followed by the reporting of practical significance, expressed as effect size. Even though this change is a step forward, effect size says nothing about whether the intervention makes a real difference in the everyday life of the clients or of those with whom they interact. Thus, in recent years, the concept of clinical significance has been increasingly emphasized. It is most frequently operationalized as quality of life, improvement in symptom level (improvement criteria), transition of patients from the dysfunctional to the functional distribution (recovery criteria), or a combination of these. Although this concept has also been subject to criticism, it has survived the debate and satisfies the set of criteria by which clinical research results are judged.
Keywords: statistical significance, practical significance, effect size, clinical significance, quality of life, reliable change, normative comparison, social significance
In the beginning there was statistical significance
For many years, statistical significance testing was the gold standard for analyzing data in many research domains, including clinical psychology and psychotherapy. The procedure proved so useful that researchers remain impressed by its potential even today. For example, Abelson (1997) asserts that "significance tests fill an important need in answering some key research questions, and if they did not exist they would have to be invented" (p. 118). Similarly, Harris (1997) states that null hypothesis significance testing (NHST) as applied by most researchers and journal editors "can provide a very useful form of social control over researchers' understandable tendency to 'read too much' into their data" (p. 145).
But what does statistical significance mean after all? Statistical significance refers to the result of significance tests, particularly the "p" value, which is the probability of obtaining research results at least as extreme as those observed (e.g., the difference between a control and an experimental group) when the null hypothesis is true (i.e., the two groups belong to the same population). Simply put, "the p value indicates the probability that observed findings occurred by chance" (Paquin, 1983, p. 38). In the context of clinical research, if we compare two types of therapy, A and B, statistical significance can indicate whether intervention A is better than intervention B.
In comparing an experimental group with a control group, the "p" value depends on three factors: 1. the magnitude of the effect (the performance difference between groups); 2. the number of observations or participants; and 3. the spread in the data (commonly measured as variance and standard deviation).
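The dependence of the "p" value on these three factors can be illustrated with a minimal sketch. It assumes equal group sizes and a known common standard deviation, so that a normal (z) approximation can replace the t test that would normally be used in practice; the function name `two_sided_p` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p value for a difference between two group means.

    Assumption: equal group sizes and a known common standard deviation,
    so the normal (z) approximation applies.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # P(|Z| >= |z|) for a standard normal variable Z
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical baseline: small difference, wide spread, few participants
baseline = two_sided_p(mean_diff=2.0, sd=10.0, n_per_group=30)

# Change one factor at a time, keeping the other two at baseline
larger_effect = two_sided_p(mean_diff=6.0, sd=10.0, n_per_group=30)
more_participants = two_sided_p(mean_diff=2.0, sd=10.0, n_per_group=300)
less_spread = two_sided_p(mean_diff=2.0, sd=3.0, n_per_group=30)

for label, p in [
    ("baseline", baseline),
    ("larger effect", larger_effect),
    ("more participants", more_participants),
    ("less spread", less_spread),
]:
    print(f"{label}: p = {p:.4f}")
```

Under these assumptions, each change (a larger difference, more participants, or less spread) lowers the "p" value relative to the baseline, even though only the first of them reflects a larger treatment effect.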
An important question for clinical research is whether statistical significance testing is enough to evaluate the effectiveness and efficacy of an intervention. If clinicians were interested only in the superiority of one therapy over another, then yes, statistical significance testing would be sufficient. The problem, however, is that clinicians want to know more, such as how large the outcome of a particular therapy is.
Statistical significance does not provide information about the magnitude of change (i.e., effect size) or whether the relationship is meaningful (i.e., clinical significance), although sometimes researchers misinterpret statistically significant results as showing clinical significance. …