Shrinkage




In multiple regression, the sample multiple correlation coefficient (R) tends to overestimate its population value; that is, R is a positively biased estimator of the population parameter. Because of this bias, applying the regression equation derived in one sample to an independent sample almost always yields a lower correlation between predicted and observed scores (this correlation is R) in the new sample than in the original sample. This reduction in R is called shrinkage. The amount of shrinkage depends on the sample size and the number of predictor variables: the larger the sample and the fewer the predictors, the less the shrinkage. The following formula can be used to estimate the value of R² in the population:
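One widely used correction of this kind is the adjusted R², 1 − (1 − R²)(N − 1)/(N − k − 1), where N is the sample size and k is the number of predictors. The Python sketch below assumes this form of the correction; the function name `adjusted_r_squared` and the example values are hypothetical, chosen only to show how the estimate depends on sample size.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Estimate the population R^2 from a sample R^2.

    Assumes the common adjusted-R^2 form:
        1 - (1 - R^2) * (n - 1) / (n - k - 1)
    where n is the sample size and k is the number of predictors.
    """
    if n - k - 1 <= 0:
        raise ValueError("sample size must exceed the number of predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Same sample R^2 and number of predictors, two different sample sizes.
small = adjusted_r_squared(0.50, n=20, k=5)
large = adjusted_r_squared(0.50, n=200, k=5)
print(round(small, 3), round(large, 3))  # 0.321 0.487
```

With a sample R² of 0.50 and five predictors, the estimate falls to about 0.32 when N = 20 but only to about 0.49 when N = 200, which matches the statement that larger samples show less shrinkage.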