Next chapter: Testing hypotheses with standard errors

Taken together, these two experiments provide relatively good evidence (p<0.05) that sleep deprivation increases reaction time. Naturally, the more times an outcome is replicated, the more believable it becomes. Suppose the experimenter ran the experiment six times and the sleep-deprived group was slower each time. If the probability values were 0.10, 0.08, 0.12, 0.07, 0.19, and 0.13, the combined probability would be 0.009. Six non-significant probabilities therefore combine to produce one highly significant probability. Compare this with the naive view that the experimenter's hypothesis is almost certainly incorrect because not one of the six experiments found a significant difference between sleep-deprived and control subjects. Such a view implicitly accepts the null hypothesis, which is a serious error.
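The passage does not say which method was used to combine the probabilities. As a minimal sketch, assuming Fisher's method for independent tests (an assumption, not something the text states), the six values above can be combined as follows. The function name `fisher_combined_p` is hypothetical and chosen only for illustration.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (assumed here;
    the text does not name the method it used)."""
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i), which follows a chi-square
    # distribution with 2k degrees of freedom when every null hypothesis holds.
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # For even degrees of freedom the chi-square upper tail has a closed form:
    # P(X >= x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    tail_sum = sum(half_stat ** j / math.factorial(j) for j in range(k))
    return math.exp(-half_stat) * tail_sum

p_values = [0.10, 0.08, 0.12, 0.07, 0.19, 0.13]
combined = fisher_combined_p(p_values)
print(round(combined, 3))  # 0.009
```

Under this assumption the result agrees with the 0.009 quoted above. The calculation treats the six experiments as independent and all six probability values as testing the same direction of effect.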

In summary, a non-significant result means that the data are inconclusive. Collecting additional data may be all that is needed to reject the null hypothesis. If the null hypothesis is true, then additional data will make clear that the effect is at most small. Even so, additional data can never prove that the effect is nonexistent.
