Partitioning the Sums of Squares
The sum of the squared deviations of Y from the mean of Y (YM) is called the sum of squares total and is referred to as SSY. SSY can be partitioned into the sum of squares predicted (SSY') and the sum of squares error (SSE). This is analogous to the partitioning of the sums of squares in the analysis of variance.
The table below shows an example of this partitioning.
X    | Y    | Y-YM | (Y-YM)² | Y'   | Y'-YM | (Y'-YM)² | Y-Y' | (Y-Y')²
     |      |      |         |      |       |          |      |
     |      |      |         |      |       |          |      |
     |      |      |         |      |       |          |      |
     |      |      |         |      |       |          |      |
Sum: | 30   | 0.0  | 17.00   | 30.0 | 0.0   | 16.20    | 0.0  | 0.8
The regression equation is:
Y' = 1.8X + 1.2
and YM = 7.5.
Defining SSY, SSY', and SSE as:
SSY = Σ(Y - YM)²
SSY' = Σ(Y' - YM)²
SSE = Σ(Y - Y')²
You can see from the table that SSY = SSY' + SSE, which means that the sum of squares for Y is partitioned into the sum of squares explained (predicted) and the sum of squares error. The ratio SSY'/SSY is the proportion of variance explained and is equal to r². For this example, r = .976, so r² = .976² = .95 and SSY'/SSY = 16.2/17 = .95.
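As a check on the arithmetic, the partitioning can be reproduced in a few lines of Python. The X and Y values below are a hypothetical dataset chosen so that the summary values match the table above (YM = 7.5, SSY = 17, SSY' = 16.2, SSE = 0.8, and Y' = 1.8X + 1.2); the individual observations themselves are not shown in this section.

```python
# Hypothetical data chosen to reproduce the sums in the table above;
# these are illustrative values, not the original observations.
X = [2, 3, 4, 5]
Y = [5, 6, 9, 10]

n = len(X)
x_mean = sum(X) / n
y_mean = sum(Y) / n                # YM = 7.5

# Least-squares slope and intercept: b = Sxy / Sxx, a = YM - b * XM
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(X, Y))
sxx = sum((x - x_mean) ** 2 for x in X)
b = sxy / sxx                      # slope, 1.8
a = y_mean - b * x_mean            # intercept, 1.2
Yp = [b * x + a for x in X]        # predicted values Y'

SSY  = sum((y - y_mean) ** 2 for y in Y)           # total: 17.0
SSYp = sum((yp - y_mean) ** 2 for yp in Yp)        # predicted: 16.2
SSE  = sum((y - yp) ** 2 for y, yp in zip(Y, Yp))  # error: 0.8

print(SSY, SSYp, SSE)              # 17.0, 16.2, 0.8 up to rounding
print(SSYp / SSY)                  # proportion explained = r² ≈ .95
```

Note that SSY = SSY' + SSE holds exactly (up to floating-point rounding) whenever Y' comes from the least-squares line, because the residuals Y - Y' are uncorrelated with the predicted values.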
Next section: Confidence intervals and significance tests