Practical methods for working out two-variable correlation and regression problems
Terms to Be Used. The preceding discussion has developed the means by which values of one variable may be estimated from the values of another, according to the functional relation shown in a set of paired observations. Simple correlation involves only the means for making such estimates and for measuring, for the given set of observations, how closely those estimates conform to, and account for, the original variation in the variable being estimated.
The term regression line is used, in statistical terminology, to designate the straight line by which one variable is estimated from another by means of the equation
Y = a + bX
This equation is termed the linear regression equation; and the coefficient b, which shows how many units (or fractional parts) Y changes for each unit change in X, is termed the coefficient of regression.
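As an illustration of the two constants, the sketch below computes a and b for a small set of paired observations. It assumes the line is fitted by least squares, which this passage does not itself state; the function name fit_line and the sample data are hypothetical and serve only to show what the coefficient of regression measures.

```python
from statistics import mean


def fit_line(x, y):
    """Return (a, b) for Y = a + bX, assuming a least-squares fit."""
    mean_x, mean_y = mean(x), mean(y)
    # Sum of products of deviations from the means, and sum of squared X deviations.
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum_xy / sum_xx        # coefficient of regression: change in Y per unit of X
    a = mean_y - b * mean_x    # value of Y estimated when X is zero
    return a, b


# Hypothetical observations lying exactly on Y = 1 + 2X.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
a, b = fit_line(x, y)
print(f"Y = {a} + {b}X")
```

With these observations each unit increase in X is accompanied by an increase of two units in Y, so b comes out as 2.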
Where a curvilinear function has been determined, either by the use of an equation or by graphic methods, the corresponding curve is similarly designated as the regression curve. Either the mathematical equation or, if none has been computed, the expression
Y = f (X)
where the symbol f (X) stands for the relation shown by the graphic curve, is termed the regression equation.
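Where the curve exists only as a graph, the expression Y = f(X) can still be put to work by reading a few points from the curve and interpolating between them. The sketch below is one possible way of doing so and is not a procedure given in the text: the readings in curve_points are hypothetical, and straight-line interpolation between adjacent readings is an assumption made for simplicity.

```python
from bisect import bisect_right

# Hypothetical (X, Y) readings taken from a freehand regression curve, sorted by X.
curve_points = [(0.0, 1.0), (2.0, 4.0), (4.0, 6.0), (6.0, 7.0), (8.0, 7.5)]


def f(x):
    """Estimate Y for a given X by interpolating between adjacent curve readings."""
    xs = [point[0] for point in curve_points]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("X lies outside the range covered by the curve")
    # Index of the reading that closes the interval containing x.
    i = min(bisect_right(xs, x), len(xs) - 1)
    (x0, y0), (x1, y1) = curve_points[i - 1], curve_points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


print(f(3.0))  # estimate of Y midway between the readings at X = 2 and X = 4
```

The function refuses values of X beyond the ends of the curve, since the graph gives no information about the relation outside the observed range.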
The coefficient of correlation and the index of correlation have both been defined as the ratio of the standard deviation of the estimated values of Y to the standard deviation of the actual values, whereas the standard error of estimate has been defined as the standard deviation of the residuals.
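The two definitions can be checked numerically. The sketch below assumes a least-squares straight line and uses standard deviations computed with the number of observations as divisor, with no adjustment for the number of constants fitted; both choices are assumptions, and the data and function name are hypothetical. A ratio of standard deviations is never negative, so the sign of the correlation must be taken from the slope b.

```python
from statistics import mean, pstdev


def correlation_and_standard_error(x, y):
    """Return (r, s): ratio of standard deviations, and standard deviation of residuals."""
    mean_x, mean_y = mean(x), mean(y)
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    estimated = [a + b * xi for xi in x]                    # values of Y estimated from X
    residuals = [yi - ei for yi, ei in zip(y, estimated)]   # actual minus estimated
    r = pstdev(estimated) / pstdev(y)   # magnitude of the coefficient of correlation
    s = pstdev(residuals)               # standard error of estimate (unadjusted)
    return r, s


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, s = correlation_and_standard_error(x, y)
print(f"r = {r:.4f}, standard error of estimate = {s:.4f}")
```

For these observations the squared standard deviation of the estimates (0.72) and that of the residuals (0.48) add up to the squared standard deviation of the actual values (1.20), which is why the two measures describe the same fit from opposite sides.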