The formulas for b and A are:

b = r(s_{y}/s_{x}), where r is the Pearson's correlation between X and Y, s_{y} is the standard deviation of Y, and s_{x} is the standard deviation of X, and A = M_{y} - bM_{x}, where M_{y} and M_{x} are the means of Y and X.

Notice that b = r whenever s_{x} = s_{y}.
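The formulas above can be sketched in code. This is a minimal illustration using the example data from this section; the variable names mirror the notation in the text, and the standard deviations and r are computed from first principles rather than taken from a library.

```python
from math import sqrt

# Example data from this section
X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]

n = len(X)
M_x = sum(X) / n
M_y = sum(Y) / n

# Sample standard deviations of X and Y
s_x = sqrt(sum((x - M_x) ** 2 for x in X) / (n - 1))
s_y = sqrt(sum((y - M_y) ** 2 for y in Y) / (n - 1))

# Pearson's correlation between X and Y
r = sum((x - M_x) * (y - M_y) for x, y in zip(X, Y)) / ((n - 1) * s_x * s_y)

b = r * (s_y / s_x)   # slope
A = M_y - b * M_x     # intercept

print(round(b, 6), round(A, 6))  # → 1.8 1.2
```

For these data the slope and intercept come out to b = 1.8 and A = 1.2, matching the values used in the rest of this section.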

For the example, b = 1.8 and A = 1.2; therefore, Y' = 1.8X + 1.2.

The first value of Y' is 4.8. This was computed as: (1.8)(2) + 1.2 = 4.8. The previous page stated that the regression line is the best fitting straight line through the data. More technically, the regression line minimizes the sum of the squared differences between Y and Y'. The third column of the table shows these differences and the fourth column shows the squared differences. The sum of these squared differences (.04 + .36 + .04 + .36 = .80) is smaller than it would be for any other straight line through the data.
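The predicted values and squared differences described above can be reproduced directly. This sketch plugs the section's b and A into the regression equation for each X:

```python
X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]
b, A = 1.8, 1.2

# Predicted values Y' from the regression equation
Y_pred = [b * x + A for x in X]

# Differences Y - Y' and their squares
residuals = [y - yp for y, yp in zip(Y, Y_pred)]
sse = sum(e ** 2 for e in residuals)

print([round(v, 1) for v in Y_pred])  # → [4.8, 6.6, 8.4, 10.2]
print(round(sse, 2))                  # → 0.8
```

The sum of squared differences comes out to .80, matching the hand computation in the text.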

X    Y    Y'     Y-Y'   (Y-Y')²
2    5    4.8     .2     .04
3    6    6.6    -.6     .36
4    9    8.4     .6     .36
5   10   10.2    -.2     .04

Since the sum of squared deviations is minimized, this criterion for the best fit is called the "least squares criterion." Notice that the sum of the differences (.2 - .6 + .6 - .2) is zero.
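The least squares criterion can be checked numerically: the fitted line's sum of squared deviations is smaller than that of any other straight line, and its residuals sum to zero. The alternative line below (b = 2.0, A = 0.5) is an arbitrary choice for illustration, not from the text.

```python
X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]

def sse(b, A):
    # Sum of squared differences between Y and the line's predictions
    return sum((y - (b * x + A)) ** 2 for x, y in zip(X, Y))

# Least-squares line vs. an arbitrary nearby line
print(round(sse(1.8, 1.2), 2))  # → 0.8
print(round(sse(2.0, 0.5), 2))  # → 1.0  (worse, as the criterion predicts)

# The least-squares residuals sum to zero
residuals = [y - (1.8 * x + 1.2) for x, y in zip(X, Y)]
print(round(abs(sum(residuals)), 10))  # → 0.0
```

Trying other slopes and intercepts always yields a sum of squared deviations at or above .80, which is what "best fitting" means here.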