The Ordinary Least Squares (OLS) regression process minimizes the sum of the squared deviations of the actual data from the fitted equation by estimating an unconstrained zero-intercept and a coefficient for each independent variable. A powerful regression explains most of the variance of the dependent variable and leaves only a small unexplained residual variation (the Error Sum of Squares). A weak regression explains little of the variance of the dependent variable, has a high Error Sum of Squares, and shows a low correlation coefficient. The coefficient of determination (R²), the square of the correlation coefficient, measures the proportion of the total variation in the dependent variable that is "explained" by the regression equation. The adjusted R², which is used in this analysis, improves the measure of "goodness of fit" to the data by accounting for the degrees of freedom of the equation, comparing the residual variance (rather than the raw variation) of the dependent variable against its total variance.
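To make these quantities concrete, the following sketch fits an unconstrained-intercept OLS model and computes R² and adjusted R² from the Error and Total Sums of Squares. The data, variable names, and coefficient values are illustrative assumptions, not drawn from the analysis described here.

```python
import numpy as np

# Minimal OLS sketch with made-up data (all names and values are illustrative).
rng = np.random.default_rng(0)
n, k = 30, 2                                   # n observations, k independent variables
X = rng.uniform(1.0, 10.0, size=(n, k))
y = 5.0 + X @ np.array([2.0, -1.5]) + rng.normal(0.0, 1.0, size=n)

# Design matrix with a column of ones for the unconstrained zero-intercept.
A = np.column_stack([np.ones(n), X])
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)  # [intercept, coef_1, coef_2]

residuals = y - A @ coefs
sse = residuals @ residuals                    # Error Sum of Squares
sst = ((y - y.mean()) ** 2).sum()              # Total Sum of Squares
r2 = 1.0 - sse / sst
# Adjusted R^2 compares variances, i.e. sums of squares per degree of freedom.
adj_r2 = 1.0 - (sse / (n - k - 1)) / (sst / (n - 1))
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```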
A high adjusted R² means that the unexplained variance (the difference between the actual data and the data predicted by the regression equation) is small, and that the independent variables account for most of the variance of the dependent variable. If the data include sufficient observations with independent variables near zero to allow statistically meaningful results, the zero-intercept can be interpreted directly as a reasonable estimate of the dependent variable when the independent variables are all zero; a sketch of this case follows. In such a regression, high "goodness of fit" should mean that the zero-intercept is comparable to the mean of the dependent variable.
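The sketch below illustrates the near-zero case under assumed data: when the independent variables cluster near zero, the coefficient terms contribute little to the predicted values, so a well-fitting regression yields an intercept comparable to the mean of the dependent variable. Again, all names and values are hypothetical.

```python
import numpy as np

# Near-zero case: independent variables cluster near zero (made-up data).
rng = np.random.default_rng(2)
n = 40
X = rng.uniform(0.0, 0.1, size=(n, 2))         # observations near zero
y = 3.0 + X @ np.array([2.0, 1.0]) + rng.normal(0.0, 0.05, size=n)

A = np.column_stack([np.ones(n), X])
intercept = np.linalg.lstsq(A, y, rcond=None)[0][0]
# With the variable means near zero, the coefficient terms are small,
# so the fitted intercept lands close to the mean of the dependent variable.
print(f"intercept = {intercept:.3f}, mean(y) = {y.mean():.3f}")
```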
In the absence of sufficient observations near zero for the independent variables, the zero-intercept can only be interpreted in conjunction with the goodness of fit of the regression. Because an OLS fit with an unconstrained intercept always passes through the point of means, it is easily shown¹ that the best estimate of the zero-intercept is the difference between the mean of the dependent variable (actual data) and the sum of the products of the estimated coefficients and the means of their respective variables (predicted data):
$$Z_e = \mathrm{CNF}_m - \left(\alpha_{1e} V_{1m} + \cdots + \alpha_{ne} V_{nm}\right).$$
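As a numerical sanity check of this identity, the sketch below verifies on assumed data that the fitted OLS intercept equals the mean of the dependent variable minus the sum of the coefficient-times-mean products. Here y stands in for CNF and the column means of X for the V_im; all values are illustrative.

```python
import numpy as np

# Verify Z_e = CNF_m - (alpha_1e * V_1m + ... + alpha_ne * V_nm) on made-up data.
rng = np.random.default_rng(1)
n = 25
X = rng.uniform(2.0, 8.0, size=(n, 2))         # no observations near zero
y = 4.0 + X @ np.array([1.2, 0.7]) + rng.normal(0.0, 0.5, size=n)

A = np.column_stack([np.ones(n), X])
intercept, *coefs = np.linalg.lstsq(A, y, rcond=None)[0]

# Mean of the dependent variable minus coefficient-weighted variable means.
z_e = y.mean() - np.dot(coefs, X.mean(axis=0))
print(np.isclose(intercept, z_e))              # True: the fit passes through the means
```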