Every investigation with more than two conditions faces two linked hazards: escalation of false alarms and loss of power. Each comparison of two means is an opportunity for a false alarm. Four conditions yield six two-mean comparisons, hence six opportunities for false alarms. The effective α for this family of six comparisons is therefore greater than the α used for each single test: not .05 but .20. This familywise α escalates rapidly with more conditions.
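The escalation can be checked numerically. The following Python sketch, using only the standard library, counts the pairwise comparisons for four conditions, computes two common reference values (the rate if the six tests were independent, and the Bonferroni upper bound), and then estimates the familywise rate by simulation. The simulation setup is an assumption made for illustration and is not taken from the text: variances are known and equal, group sizes are equal, and each comparison is a two-sided z test at α = .05. Because the six comparisons share means, they are correlated, so the simulated rate should fall below the independence value and close to the .20 cited above.

```python
import random
from itertools import combinations
from math import comb, sqrt
from statistics import NormalDist

ALPHA = 0.05          # per comparison alpha
K = 4                 # number of conditions
M = comb(K, 2)        # number of two-mean comparisons

# Two analytic reference points.
independent_rate = 1 - (1 - ALPHA) ** M   # if the M tests were independent
bonferroni_bound = min(1.0, M * ALPHA)    # upper bound under any dependence


def simulate_familywise_alpha(k, alpha, trials=20_000, seed=1):
    """Estimate P(at least one false alarm) over all pairwise z tests
    when all true means are equal (hypothetical setup: known variance,
    equal n, so each standardized sample mean is N(0, 1))."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    false_alarms = 0
    for _ in range(trials):
        means = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # The difference of two independent N(0, 1) means has sd sqrt(2).
        if any(abs(a - b) / sqrt(2) > z_crit
               for a, b in combinations(means, 2)):
            false_alarms += 1
    return false_alarms / trials


simulated_rate = simulate_familywise_alpha(K, ALPHA)
print(f"comparisons:               {M}")
print(f"independent-tests formula: {independent_rate:.3f}")
print(f"Bonferroni upper bound:    {bonferroni_bound:.3f}")
print(f"simulated familywise rate: {simulated_rate:.3f}")
```

Changing `K` shows how quickly the number of comparisons, and with it the familywise rate, grows as conditions are added.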
You can hold α down to whatever value you want, but at a price. The price is loss of power, that is, an increase in β. This is the α–β tradeoff dilemma of Chapter 2.
Two polarized philosophies have developed for dealing with this α–β tradeoff dilemma. Each philosophy is mainly concerned with avoiding one of the two hazards. The familywise philosophy postulates that α should be set at some fixed value, regardless of the number of conditions. Necessarily, therefore, power decreases with more conditions. The per comparison philosophy, seeking to lessen such loss of power, allows larger α with more conditions.
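The cost of each philosophy can be made concrete with a small calculation. The Python sketch below compares the power of a single two-mean comparison under the two approaches as the number of conditions grows. Several choices here are assumptions for illustration and do not come from the text: the familywise approach is implemented as a Bonferroni split of α = .05 across all pairwise comparisons (one of several ways to fix familywise α), each comparison is a two-sided z test, and the standardized true difference `DELTA` is a hypothetical value chosen to give power of about .80 at α = .05.

```python
from math import comb
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_power(delta, alpha):
    """Power of a two-sided z test at level alpha when the standardized
    true difference (difference / standard error) equals delta."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return (1 - STD_NORMAL.cdf(z_crit - delta)) + STD_NORMAL.cdf(-z_crit - delta)


DELTA = 2.8            # hypothetical effect: about .80 power at alpha = .05
FAMILY_ALPHA = 0.05    # familywise target, split evenly (Bonferroni)

rows = []
for k in range(2, 7):
    m = comb(k, 2)                      # number of pairwise comparisons
    per_test_alpha = FAMILY_ALPHA / m   # alpha left for each single test
    rows.append((k, m,
                 z_test_power(DELTA, 0.05),            # per comparison
                 z_test_power(DELTA, per_test_alpha)))  # familywise

print("conditions  comparisons  power(per comparison)  power(familywise)")
for k, m, pc, fw in rows:
    print(f"{k:>10}  {m:>11}  {pc:>21.3f}  {fw:>17.3f}")
```

Under these assumptions, per comparison power stays constant while familywise α grows with the number of conditions; the familywise column holds α fixed and pays for it with steadily lower power.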
The per comparison philosophy is advocated in this chapter for most work in experimental psychology. Current texts, in contrast, increasingly follow the familywise philosophy. But although some situations do require familywise analysis, these are not common in experimental psychology. The per comparison philosophy is founded on empirical common sense, as shown in the Parable of the Two Philosophies.
Extrastatistical considerations have an essential role with multiple comparisons. Among these are the guideline of planned comparisons and the principle of replication, which do much to provide reasonable control of α and β. This guideline and this principle are accepted by most empirical investigators.
Empirical judgment, not statistical technique, is the primary basis for handling multiple comparisons. Empirical judgment underlies the cost–benefit analysis necessary to deal with the α–β tradeoff dilemma. Above all, empirical judgment is needed in planning the design and analysis, especially in relation to the guideline of planned tests and the principle of replication. This chapter, accordingly, sets the issue of multiple comparisons within the framework of the Experimental Pyramid of Chapter 1.