Cross-classification and the analysis of variance
Introduction. In earlier editions, cross-classification and averaging was presented as a method of analysis that stopped short of formal multiple regression but embodied the basic idea of net regression lines or curves. In the present edition the concept of net regression has already been presented on a mathematical basis; the "drift lines" of the short-cut graphic method have also been presented as approximations to the final net regression curves. When large numbers of observations are available, the idea of studying the relationship between X1 and X2 within each of a number of subclasses of X3 is intuitively obvious. Most data of the census type are presented in the form of cross-tabulations, so that frequencies and average levels of a "dependent" variable can be obtained for different combinations of values (usually classes or ranges) of two or three "independent" variables. Average incomes of workers cross-classified by age and years of schooling would be an example. Income, age, and years of schooling are all quantitative factors. In addition there may be non-quantitative subdivisions: the same three variables may be available for male and female workers separately and for each state or region.
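As a minimal sketch of the idea, the income example might be tabulated as follows. The records, class boundaries, and the function name `cross_classify` are hypothetical and chosen only for illustration; they are not taken from census data or from the worked example of this chapter. Each cell of the table holds the frequency and the average income for one combination of age class and schooling class, and the schooling difference is then read off within each age class separately.

```python
from collections import defaultdict

# Hypothetical records: (age class, schooling class, annual income).
# The values are made up for illustration only.
records = [
    ("25-34", "under 9 yrs", 3000),
    ("25-34", "under 9 yrs", 3400),
    ("25-34", "9-12 yrs", 4000),
    ("25-34", "9-12 yrs", 4400),
    ("35-44", "under 9 yrs", 3600),
    ("35-44", "under 9 yrs", 3800),
    ("35-44", "9-12 yrs", 5000),
    ("35-44", "9-12 yrs", 5400),
]


def cross_classify(rows):
    """Return {(age, schooling): (frequency, average income)} for each cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for age, schooling, income in rows:
        sums[(age, schooling)] += income
        counts[(age, schooling)] += 1
    return {cell: (counts[cell], sums[cell] / counts[cell]) for cell in counts}


table = cross_classify(records)

# Print the cross-tabulation: one line per cell.
for (age, schooling), (n, mean) in sorted(table.items()):
    print(f"{age:6} {schooling:12} n={n}  mean income={mean:.0f}")

# Difference in average income between schooling classes,
# taken within each age class (age held roughly constant).
schooling_gap = {
    age: table[(age, "9-12 yrs")][1] - table[(age, "under 9 yrs")][1]
    for age in sorted({a for a, _ in table})
}
print(schooling_gap)
```

Comparing the schooling classes within a single age class is what holds age approximately constant; with real data the cells would be of unequal size, and the cell frequencies would indicate how much weight each average can bear.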
Sometimes simple methods are overlooked, so the first section of this chapter will present an example of analysis by means of cross-classification and averaging. The remainder will present some basic principles of the analysis of variance and discuss its relationship to regression analysis. While analysis of variance has been developed primarily in connection with agricultural and biological experiments, its use has spread over the whole range of experimental sciences including, in recent years, some applications to the social sciences. For example, the formulas used in Chapter 22 to estimate the effects of a qualitative independent variable are based upon variance analysis concepts.