Academic journal article Journal of Theory Construction and Testing

Observations on the Usefulness of Null Hypothesis Testing

Article excerpt

Abstract: Nursing has long relied on classical null hypothesis significance testing in establishing effects within the broad enterprise of theory testing. The usefulness of significance testing has been debated throughout its history, and the debate has distilled to vociferous proclamations that the null hypothesis, H_0, and the p-value given by significance testing are inept and scientifically irrelevant. Serious proposals to eliminate H_0 and traditional H_0 testing have appeared, with recommendations to focus on effect size and clinical relevance instead. Careful scrutiny reveals uncertainties in these indictments and in exclusive attention to effect size. Another criticism concerns the convention of rejecting or not rejecting H_0 based on p < .05 regardless of effect size. This reproach has meritorious implications for practice. The specific incriminations against H_0 testing must be examined before the debate is considered settled and nursing science joins in doing away with significance testing. While clinical relevance of an effect is a crucial aspect of asserted support for a research hypothesis, an observed sample effect should also be considered in view of a sampling error hypothesis. Significance testing uses a mathematical simulation tool for addressing the sampling error explanation of observed effects, while judgements about "significance" are subjective convictions.

Keywords: null hypothesis, significance, effects, clinical relevance

Revisiting the longstanding debate about classical null hypothesis significance testing is timely. Decades of criticism about the mathematical basis and interpretation of classical testing have culminated in unmistakable calls to abandon the classical approach (Cohen, 1990; Kirk, 1996; Schmidt, 1996; Tukey, 1991). Critics claim the following: (1) the mathematical null hypothesis of "no effect" is essentially trivial; (2) the testing probability or p-level is inept and unusable, making classical test interpretation illogical; (3) sample effects that might be clinically meaningful are bound to be discarded if not found to be "statistically significant" by the assumed trivial criterion. These allegations serve as motivation for "doing away with null hypothesis significance testing" (Kirk, 1996, p. 756), to be replaced with immediate consideration of effect size and clinical relevance of findings per se (Kirk, 1996; LeFort, 1993).
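The third criticism can be made concrete with a small numerical sketch. The example below is illustrative only and is not drawn from the article: the two groups of scores are hypothetical, Cohen's d is assumed as the effect size measure, and an exact permutation test is assumed as one way of simulating the sampling error explanation of a mean difference. The function names are likewise invented for the illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def exact_permutation_p(a, b):
    """Two-sided exact permutation p-value for the difference in means.

    Every way of relabeling the pooled scores into groups of the original
    sizes is enumerated; the p-value is the share of relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    n = len(pooled)
    at_least_as_extreme = 0
    total = 0
    for idx in combinations(range(n), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(n) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Hypothetical scores, invented purely for illustration.
treatment = [6, 8, 9, 11]
control = [4, 6, 7, 9]

d = cohens_d(treatment, control)
p = exact_permutation_p(treatment, control)
print(f"Cohen's d = {d:.2f}, exact permutation p = {p:.3f}")
```

With these invented scores the standardized difference is about 0.96, which would commonly be labeled a large effect, yet the permutation p-value is about .29 and the conventional criterion would not reject the sampling error hypothesis. Whether such an effect should be set aside or pursued is the judgement at issue in the debate; the calculation itself does not settle it.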

Nursing science relies on classical significance testing, making it imperative that these specific allegations be examined. The purpose of this writing is to thoroughly consider the logical rigor, empirical bases, and implications of these perennial incriminations against classical hypothesis testing in determining whether the claims are an unassailable and sufficient basis for abandoning the classical approach. Discernible flaws in some criticisms may suggest that abandoning the classical approach is unwarranted. The working assumptions of this exposition are that (1) researchers must consider the null or sampling error hypothesis as competing with substantive theoretical explanations of observed effects such as mean differences and correlations, and (2) hypothesis support is an expression of belief, not an imperative of mathematical operations applied to sample findings.

The Null Hypothesis in Context

Research designs host non-zero effects such as mean differences and correlations, but only the flawless design, one that controls all possible threats to internal validity, yields an effect that is interpretable vis-a-vis the research hypothesis, H_R, and the sampling error hypothesis, H_0. When experimental failure and contamination are suspected of having produced an effect, wholly or in part, H_R cannot be assessed and H_0 is false by default and therefore trivial. H_0 is a formal assertion or presumption that sampling error produced "the entire" effect, and this possibility is voided by any flaw in experimental or survey technique that may have produced any portion of the effect. …
