Can Parapsychology Move beyond the Controversies of Retrospective Meta-analyses?

The field of parapsychology remains highly controversial and has not obtained the degree of acceptance and support that is needed. For the past 25 years, meta-analyses have been the foundation for the debates about the evidence for psi. This article focuses on two questions: why have the meta-analyses been controversial, and what can be done to move beyond these controversies?

Although the issues described here manifest in meta-analyses, the discussion covers much more than meta-analyses. Some of the key issues originate with the methodology and findings in the original experiments and must be addressed by appropriate new experiments. Also, alternative strategies for research synthesis may avoid some of the controversies associated with meta-analysis. Most of the final recommendations here do not involve meta-analysis.

The topics covered can be categorized as (a) intrinsic limitations of meta-analysis, (b) unfortunate experimental practices in parapsychological research, (c) problematic properties of the experimental findings in parapsychology, and (d) unfortunate meta-analysis practices in parapsychology. The combination of these factors has made parapsychological meta-analyses controversial. Because these categories interact, the same or similar topics are sometimes discussed under more than one category.

This article does not attempt to comprehensively discuss all aspects of every issue. Some of the topics are controversial. The purpose here is to describe enough of the differing opinions to indicate practices that are not convincing if challenged.

Intrinsic Limitations of Meta-Analyses

The advent of meta-analysis in parapsychology in the 1980s was greeted with great enthusiasm. Small studies could be integrated to provide quantitative evidence for an effect and to evaluate potential moderating factors. Rosenthal (1986) and Utts (1986, 1991) argued that effect size was a more appropriate measure of replication than statistical significance. The usual practice of ignoring power analysis when designing experiments appeared to have good justification. Large studies were not needed. Meta-analysis was considered to provide the definitive evaluation of a line of research and to provide compelling evidence for psi. Broughton (1991) described meta-analysis as a "controversy killer."
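
The reasoning behind this early enthusiasm can be made concrete with a small calculation. The following is a minimal sketch, not a reconstruction of any analysis discussed here: it assumes a one-sided, one-sample z-test, a hypothetical standardized effect size of 0.2, 40 trials per study, and 10 independent studies, and it combines the studies with the unweighted Stouffer method. All of these values and the function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_one_sided_z(effect_size, n, alpha=0.05):
    """Approximate power of a one-sided, one-sample z-test.

    effect_size: standardized effect (hypothetical value, not from the article)
    n: number of trials or participants in a single study
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative, the test statistic is centered on effect_size * sqrt(n).
    return 1 - _STD_NORMAL.cdf(z_crit - effect_size * sqrt(n))


def stouffer_z(z_scores):
    """Combine independent study z-scores with the unweighted Stouffer method."""
    return sum(z_scores) / sqrt(len(z_scores))


if __name__ == "__main__":
    d, n, k = 0.2, 40, 10  # illustrative assumptions only
    print(f"Power of one study with n={n}: {power_one_sided_z(d, n):.2f}")

    # Idealized case: every study observes exactly its expected z-score.
    z_per_study = d * sqrt(n)
    print(f"Expected z in one study: {z_per_study:.2f}")
    print(f"Stouffer z across {k} studies: {stouffer_z([z_per_study] * k):.2f}")
```

Under these assumptions each small study is underpowered and would usually fail to reach significance on its own, while the combined z-score is large. This is the arithmetic that made large studies and power analysis seem unnecessary; it does not address the objections to that conclusion discussed in the rest of this section.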

However, this early optimism was not realized in practice. After noting cases in which meta-analysis has been applied to controversial topics in psychology, Ferguson and Heene (2012) recently commented:

   [W]e have seldom seen a meta-analysis resolve a controversial
   debate in a field.... [W]e observe that the notion that
   meta-analyses are arbiters of data-driven debates does not appear
   to hold true.... [M]eta-analyses may be used in such debates to
   essentially confound the process of replication and
   falsification.... [F]ocusing on the average effect size may be used
   to, in effect, brush the issue of failed replication under the
   theoretical rug.... (p. 558).

The controversial debates noted in the article did not include parapsychology, but the comments aptly describe the experience with meta-analysis in parapsychology.

The limitations of meta-analyses were also apparent in medical research. Inconsistent or contradictory conclusions had been reached in different meta-analyses of the same database (Bailar, 1997). The statistical book most frequently used at a pharmaceutical company I recently worked with said the following:

   Our inclusion of [meta-analysis] in a chapter on exploratory
   analyses is an indication of our belief that the importance of
   meta-analysis lies mainly in exploration, not confirmation. In
   settling therapeutic issues, a meta-analysis is a poor substitute
   for one large well-conducted trial. In particular, the expectation
   that a meta-analysis will be done does not justify designing
   studies that are too small to detect realistic differences with
   adequate power. …