Null Science: Psychology's Statistical Status Quo Draws Fire
Bower, Bruce, Science News
Geoffrey R. Loftus, a psychologist at the University of Washington in Seattle, experiences "a certain angst" about his discipline these days. Over the past 30 years, he has built a successful scientific career and now edits the journal Memory and Cognition. From this lofty vantage point, Loftus sees with dismay a research landscape dotted with dense stands of conflicting data that strangle theoretical advances at their roots.
Findings reported by one set of investigators often fail to hold up in independent studies and rarely lead to breakthrough models of how minds work, Loftus remarks. This conceptual muddle, in his view, reflects a deeply flawed approach to doing science. Most researchers strap on a statistical straitjacket that offers enough flexibility to fire off publishable rounds of data but prevents anyone from heaving any thunderbolts of psychological insight.
"What we do, I sometimes feel, is akin to trying to build a violin using a stone mallet and chain saw," Loftus says. "The tool-to-task fit is not very good, and we wind up building a lot of poor-quality violins."
Loftus' musical analogy resonates deeply with many psychologists. In fact, a growing number openly criticize what they see as their field's statistical shortsightedness. Discontent has focused particularly on the practice known as null hypothesis testing, or significance testing.
In a significance test, the investigator typically gathers data to test the prediction that key experimental measures bear no relationship to one another. For example, such a null hypothesis might posit that the average amount and intensity of behavior problems in two groups of children occur independently of the presence or absence of marital distress in the youngsters' families.
Psychologists usually hope to reject the null hypothesis, based on a 5 percent or lower level of significance. Many see this level as indicating that they will be wrong no more than 5 percent of the time when they claim that two conditions are linked. Such significance levels signal to them that the measured variables probably do bear a relationship to one another.
At that point, researchers engage in a kind of 5 percent solution and proffer their favored explanations for a finding--say, by concluding that misbehavior mushrooms in children who grow up with battling parents.
Critics view this practice, and the assumptions underlying it, as unjustified. Significance testing simply establishes the probability of obtaining a certain data set, they argue, assuming from the start that the null hypothesis is true.
Thus a 5 percent significance level in the study described above indicates to them that an errant statistical link between children's misbehavior rates and discord in their parents' marriages would occur only 1 in 20 times, if these variables indeed operate independently. From this perspective, significance levels--no matter how low they go--say nothing about the likelihood of any proposed explanation for statistically significant results.
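The critics' reading of the 5 percent level can be illustrated with a small simulation. The sketch below is not drawn from any study mentioned here; the group size, number of trials, and the simple z-test with a known spread are all assumptions chosen for illustration. It draws two groups from the same population, so the null hypothesis is true by construction, and counts how often the difference nonetheless crosses the 5 percent cutoff.

```python
import math
import random


def false_positive_rate(trials=2000, n=30, seed=0):
    """Share of trials that look 'significant' when no real effect exists.

    All parameters are hypothetical: n scores per group, drawn from one
    standard normal population, compared with a two-sided z-test.
    """
    rng = random.Random(seed)
    critical_z = 1.96  # two-sided 5 percent cutoff for a standard normal
    rejections = 0
    for _ in range(trials):
        # Both groups come from the same distribution: the null is true.
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        diff = sum(group_a) / n - sum(group_b) / n
        # Standard error of the difference in means when the spread is 1.
        z = diff / math.sqrt(2.0 / n)
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials


rate = false_positive_rate()
print(f"Share of 'significant' results with no real effect: {rate:.3f}")
```

Under these assumptions the share should land near 0.05, or roughly 1 in 20. That figure describes how often chance alone produces a "significant" link when none exists. It does not measure how likely any particular explanation of a significant result is to be correct.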
"The shared secret of psychological researchers is that we don't take our own data too seriously when reaching theoretical judgments," contends John E. Richters, head of the disruptive disorders program at the National Institute of Mental Health in Rockville, Md.
"Even the brightest people use empirical research mainly to keep their careers going. When I talk to them in private, they express much more sophisticated views about mental functioning than what you see in their published reports."
Nonetheless, many of the same folks treat significance testing as a handy way to convert behavioral observations into objective scientific conclusions, notes psychologist Patrick E. Shrout of New York University. In complementary fashion, peer reviewers and editors at top journals routinely reject papers that do not boast significance levels of 5 percent or lower. …