In multiple regression, the sample multiple correlation
coefficient (R) tends to overestimate the population
value of R. In other words, the sample R is a positively biased
estimator of the population parameter, because the regression weights
are fitted to capitalize on chance variation in the particular sample.
As a result, applying the regression equation derived in one sample to
an independent sample almost always yields a
lower correlation between predicted and observed scores
(that is, a lower R) in the new sample than in the original sample.
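
The drop in R described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not a definitive procedure: the function names (multiple_r, simulate), the population weights of 0.2, the standard normal predictors, and the unit-variance error term are all hypothetical choices made for illustration. It fits a least-squares equation in one sample, applies the same equation to an independent sample from the same population, and averages both correlations over many repetitions.

```python
import numpy as np


def multiple_r(X, y, coef):
    """Correlation between observed y and scores predicted from coef.

    coef[0] is the intercept; coef[1:] are the predictor weights.
    """
    predicted = coef[0] + X @ coef[1:]
    return np.corrcoef(predicted, y)[0, 1]


def simulate(n=30, p=5, trials=500, seed=0):
    """Return (mean R in the original sample, mean R in a new sample)."""
    rng = np.random.default_rng(seed)
    beta = np.full(p, 0.2)  # assumed population weights (illustrative only)
    r_original, r_new = [], []
    for _ in range(trials):
        # Original sample: used to derive the regression equation.
        X1 = rng.standard_normal((n, p))
        y1 = X1 @ beta + rng.standard_normal(n)
        # Independent sample drawn from the same population.
        X2 = rng.standard_normal((n, p))
        y2 = X2 @ beta + rng.standard_normal(n)

        # Ordinary least squares with an intercept column.
        design = np.column_stack([np.ones(n), X1])
        coef, *_ = np.linalg.lstsq(design, y1, rcond=None)

        r_original.append(multiple_r(X1, y1, coef))
        r_new.append(multiple_r(X2, y2, coef))
    return float(np.mean(r_original)), float(np.mean(r_new))


if __name__ == "__main__":
    for n in (30, 300):
        r_orig, r_cross = simulate(n=n)
        print(f"n={n}: original R={r_orig:.3f}, new-sample R={r_cross:.3f}")
```

Running the script prints the average R in the original and in the new samples for a small and a large sample size. Under these assumptions, the new-sample R should come out lower on average, and the gap should narrow as n grows.
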
This reduction in R is called shrinkage. The amount of shrinkage depends
on the sample size and the number of predictor variables: the larger the
sample and the fewer the predictors, the smaller the shrinkage. The following
formula can be used to estimate the value of R² in the population: