Generality is a prime goal of scientific inference because science depends on evidence from samples. In psychology, experimental results are typically obtained from a small sample of subjects tested in one narrow experimental situation, with a very small sample of stimulus conditions and usually a single task and a single measure of behavior. These results have value only to the extent that they generalize. No one is interested, for example, in those particular infant monkeys in Harlow's experiments on mother love; their behavior is of interest only insofar as it generalizes to other infants, especially human infants, across a wider range of test situations than Harlow used. The four kinds of generality implicit in the previous sentence are discussed on pages 20–24.
Reliability, or replicability, is one aspect of generality. Different samples of subjects will yield different results. Perhaps the effect observed in your sample is merely an accident of which subjects happened to get into your particular sample. Any claim for a real effect should be prefaced by evidence that it is reliable, that is, not likely to be produced by chance alone. Statistics can help assess reliability. No less important, statistics can help you plan your experiments to get more reliability for less cost (pages 16–20).
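The point that different samples yield different results can be made concrete with a small simulation. The sketch below is an illustration only, not a procedure from this book: it assumes two groups of 10 subjects drawn from the same normal population, so the true effect is zero, and the function name, group size, and the 0.5 cutoff are all arbitrary choices made for the example.

```python
import random
import statistics

def simulate_null_differences(n_per_group=10, n_replications=1000, seed=0):
    """Mean difference between two groups drawn from the SAME population.

    All parameter values are illustrative assumptions. Because both groups
    come from one population, any nonzero difference is due to chance alone.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_replications):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diffs.append(statistics.mean(group_a) - statistics.mean(group_b))
    return diffs

diffs = simulate_null_differences()

# Share of replications whose chance difference exceeds half a standard deviation.
share_large = sum(abs(d) > 0.5 for d in diffs) / len(diffs)

print(f"SD of sample differences: {statistics.stdev(diffs):.2f}")
print(f"Share of samples with |difference| > 0.5: {share_large:.2f}")
```

With samples this small, a sizable share of the replications should show an apparent effect of half a standard deviation or more even though no real effect exists. Raising `n_per_group` shrinks that share, which is one way statistics can guide planning for more reliability.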
Validity, which is far more important than reliability, is primarily an extrastatistical issue, one that must be addressed in terms of substantive knowledge. The ubiquitous threat to validity is confounding. Confounding arises because the experimenter employs some concrete stimulus manipulation that is intended to elicit a specified process. But this manipulation may also elicit some other process that undercuts the interpretation. The classic example of confounding is the placebo effect, in which the suggestion produced by giving a medicine has beneficial effects even though the medicine itself is worthless. Validity is more complex, however, as discussed on pages 8–16.
Generality, reliability, and validity are mainly extrastatistical problems. Statistics can furnish valuable assistance with some aspects of these problems, but effective use of statistics depends on integrating it with extrastatistical, empirical knowledge.
Six levels of knowledge are distinguished in the Experimental Pyramid (page 3). Each lower level is more important. Statistics, although mainly applicable at the top level, can also help at each lower level. Your labors will be more productive the more you learn how to integrate statistical inference into an empirical framework of extrastatistical, scientific inference. This empirical direction is the main theme of this initial chapter and of the entire book.