REGRESSION AND CORRELATION
Prediction is a strong point of regression analysis, already illustrated with the prediction of success in graduate school in Section 1.5.1 (page 25). One valuable capability of regression analysis is that it can utilize observational data, as in this example, without requiring experimental control. Multiple predictor variables can be teamed for better results with little more trouble than a single predictor. This simplicity contrasts with factorial Anova, in which each added variable expands the bulk of the factorial design. This usefulness of regression analysis reflects the empirical prevalence of linear trends with quantitative, metric variables.
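That "little more trouble" can be made concrete: in least-squares regression, each added predictor is just one more column in the design matrix, with no expansion of the design itself. A minimal sketch, using numpy's least-squares solver on made-up numbers (the data, and the choice of a second test-score predictor, are illustrative assumptions, not from the text):

```python
import numpy as np

# Hypothetical data: predict graduate success (Y) from two predictors,
# college GPA (X1) and an admissions test score (X2).  Numbers are
# illustrative only.
X1 = np.array([2.8, 3.0, 3.2, 3.5, 3.7, 3.9])
X2 = np.array([520.0, 560.0, 610.0, 600.0, 680.0, 700.0])
Y = np.array([2.1, 2.6, 2.7, 3.1, 3.4, 3.6])

# Design matrix: a column of 1s for the intercept, one column per predictor.
# Teaming in another predictor would mean adding one more column here.
A = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares weights b0, b1, b2 for the prediction formula
# Y' = b0 + b1*X1 + b2*X2.
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
predicted = A @ b
print(np.round(b, 4))
```

Contrast this with factorial Anova, where adding a variable multiplies the number of cells in the design rather than appending a single column.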
With experimentally controlled variables, regression analysis can also be useful. Many experimental variables are metric: amount of reward, concentration of drug, length of word list, time intervals, and so forth. By utilizing the stimulus metric, regression can extract more information than factorial Anova. Simpler, nonfactorial designs also become feasible.
For causal analysis with observational data, regression has limited usefulness. To be sure, a number of health problems have been traced back to their causes using observational data; such results are commonly presented as correlations derived from the regression. In general, however, causal analysis is a weak point of regression with observational data, as shown by the well-known pitfalls of correlation.
In one-variable regression, the data come in the form of Y–X pairs, one pair for each subject: Y is the response measure; X is the predictor measure. Thus, X might be grade point average in college, and Y success in graduate school. The problem is to find a formula that uses information in X to predict Y.
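The standard such formula is the least-squares line, Y' = b0 + b1 X, fitted so that the sum of squared prediction errors is as small as possible. A minimal sketch with hypothetical GPA and graduate-success numbers (the data are illustrative assumptions, not from the text):

```python
# One-variable regression: predict Y (graduate success) from X (college GPA).
# Data are hypothetical, for illustration only.
X = [2.8, 3.0, 3.2, 3.5, 3.7, 3.9]   # predictor: college grade point average
Y = [2.1, 2.6, 2.7, 3.1, 3.4, 3.6]   # response: graduate success score

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Least-squares slope: covariance of X and Y over variance of X.
b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) \
     / sum((x - mean_x) ** 2 for x in X)
# Intercept chosen so the line passes through the point of means.
b0 = mean_y - b1 * mean_x

def predict(x):
    """Predicted Y for a given X, using the fitted line Y' = b0 + b1*x."""
    return b0 + b1 * x

print(round(b0, 3), round(b1, 3))
```

Each subject contributes one Y–X pair; the fitted slope b1 and intercept b0 then give a predicted Y for any new value of X.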