Regression Line (2 of 2)

The formulas for b and A are:
b = r(sy/sx) and A = My - bMx
where r is Pearson's correlation between X and Y, sy is the standard deviation of Y, sx is the standard deviation of X, My is the mean of Y, and Mx is the mean of X.

Notice that b = r whenever sy = sx. When scores are standardized, sy = sx = 1, so b = r and A = 0.
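
To see this numerically, here is a minimal sketch in plain Python (using the example data from the table below; the helper names are only illustrative) that standardizes X and Y and confirms that the resulting slope equals r and the intercept equals 0.

X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]

def mean(v):
    return sum(v) / len(v)

def sd(v):                                    # population standard deviation
    return (sum((x - mean(v)) ** 2 for x in v) / len(v)) ** 0.5

def z(v):                                     # standardized (z) scores
    return [(x - mean(v)) / sd(v) for x in v]

zx, zy = z(X), z(Y)
r = sum(zxi * zyi for zxi, zyi in zip(zx, zy)) / len(X)   # Pearson's r
b = r * sd(zy) / sd(zx)        # sd(zy) = sd(zx) = 1, so b = r
A = mean(zy) - b * mean(zx)    # both means are 0, so A = 0
print(round(r, 4), round(b, 4), round(A, 4))   # b equals r and A is 0 (up to rounding)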

For the example, b = 1.8 and A = 1.2; therefore, Y' = 1.8X + 1.2.
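
As a check on these values, the following minimal sketch (plain Python; no statistics library assumed) applies the formulas b = r(sy/sx) and A = My - bMx to the example data and recovers b = 1.8 and A = 1.2.

X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]
n = len(X)

Mx, My = sum(X) / n, sum(Y) / n                         # means of X and Y
sx = (sum((x - Mx) ** 2 for x in X) / n) ** 0.5         # standard deviation of X
sy = (sum((y - My) ** 2 for y in Y) / n) ** 0.5         # standard deviation of Y
r = sum((x - Mx) * (y - My) for x, y in zip(X, Y)) / (n * sx * sy)

b = r * sy / sx            # slope
A = My - b * Mx            # intercept
print(round(b, 4), round(A, 4))    # 1.8 1.2, i.e. Y' = 1.8X + 1.2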

The first value of Y' is 4.8, computed as (1.8)(2) + 1.2 = 4.8. The previous page stated that the regression line is the best-fitting straight line through the data. More technically, the regression line minimizes the sum of the squared differences between Y and Y'. The third column of the table below shows these differences and the fourth column shows the squared differences. The sum of the squared differences (.04 + .36 + .36 + .04 = .80) is smaller than it would be for any other straight line through the data; the sketch following the table reproduces these values.
 X     Y      Y'    Y-Y'    (Y-Y')²
 2     5     4.8     .2      .04
 3     6     6.6    -.6      .36
 4     9     8.4     .6      .36
 5    10    10.2    -.2      .04
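
The columns of the table can be reproduced with a short sketch like the following (plain Python; variable names are illustrative), which computes Y', the differences Y - Y', their squares, and the sum of .80.

X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]
b, A = 1.8, 1.2

Y_pred = [b * x + A for x in X]                    # Y' column: 4.8, 6.6, 8.4, 10.2
errors = [y - yp for y, yp in zip(Y, Y_pred)]      # Y - Y' column
squared = [e ** 2 for e in errors]                 # (Y - Y')^2 column
print([round(e, 2) for e in errors])               # [0.2, -0.6, 0.6, -0.2]
print(round(sum(squared), 2))                      # 0.8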
Because the regression line minimizes the sum of the squared deviations, this criterion for the best fit is called the "least squares criterion." Notice also that the sum of the differences (.2 - .6 + .6 - .2) is zero.
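
Both points can be illustrated with a rough sketch (plain Python; the grid of alternative lines is purely illustrative and not part of the original example): the errors of the least-squares line sum to zero, and no line on a coarse grid of other slopes and intercepts yields a smaller sum of squared differences than .80.

X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]

def sse(slope, intercept):
    # Sum of squared differences between Y and the line's predictions.
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(X, Y))

# The errors of the least-squares line cancel out: their sum is zero.
print(round(sum(y - (1.8 * x + 1.2) for x, y in zip(X, Y)), 10))   # 0.0

# A coarse grid of candidate lines: none beats SSE = 0.8 at slope 1.8, intercept 1.2.
best = min(
    ((sse(s / 100, i / 100), s / 100, i / 100)
     for s in range(100, 301)        # slopes 1.00 through 3.00
     for i in range(0, 301)),        # intercepts 0.00 through 3.00
    key=lambda t: t[0],
)
print(best)                          # approximately (0.8, 1.8, 1.2)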