this question seems surprising by contrast with the numerous studies of nonnormality with independent scores.
Generalization From Handy Samples. Anova rests on an assumption of randomness, specifically, that the subjects, or observations, are a random sample from some well-defined population. Rarely is this true. Nearly always, subjects are a handy sample from some vaguely defined population. With independent subjects, effective randomness can be achieved by random assignment of subjects to treatments (Section 3.3.2, page 69). But this is not possible with repeated measures, because each subject serves in every treatment condition. Hence we lack statistical warrant for claiming a real effect even in our handy sample.
In my opinion, extrastatistical inference is the foundation for repeated measures design. The essential assumption is that the treatment means and the error term for our handy sample are similar to what we would get in a population to which we wish to generalize. This assumption rests entirely on extrastatistical judgment. Qualitatively, the repeated measures error term makes sense for such extrastatistical generalization, as indicated in Subject-Treatment Residuals as Error. Statistical theory goes further to provide quantified results based on an idealized assumption of random sampling. Empirically, however, our inferences depend on extrastatistical judgments of similarity between our empirical situation and the idealized statistical situation.
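As a concrete illustration of the two quantities named above, the treatment means and the repeated measures error term, the following minimal sketch computes them for a single-factor design in which every subject receives every treatment. The scores are hypothetical numbers invented for illustration, and the function name `repeated_measures_f` is likewise an assumption, not something taken from the text. The error term is the mean square of the subject-treatment residuals, that is, what remains of each score after the subject mean and the treatment mean are removed.

```python
# Hypothetical scores, for illustration only:
# each row is one subject, each column is one treatment.
scores = [
    [3.0, 5.0, 7.0],
    [2.0, 3.0, 7.0],
    [5.0, 6.0, 10.0],
    [4.0, 6.0, 8.0],
]


def repeated_measures_f(data):
    """Return (F, df_treatment, df_error) for a one-way repeated measures layout.

    The error term is the subject-by-treatment residual mean square.
    """
    n = len(data)        # number of subjects
    a = len(data[0])     # number of treatments

    grand_mean = sum(sum(row) for row in data) / (n * a)
    subject_means = [sum(row) / a for row in data]
    treatment_means = [sum(row[j] for row in data) / n for j in range(a)]

    # Variability among the treatment means.
    ss_treatment = n * sum((m - grand_mean) ** 2 for m in treatment_means)

    # Subject-treatment residuals: score minus subject mean minus
    # treatment mean plus grand mean, squared and summed.
    ss_error = sum(
        (data[i][j] - subject_means[i] - treatment_means[j] + grand_mean) ** 2
        for i in range(n)
        for j in range(a)
    )

    df_treatment = a - 1
    df_error = (a - 1) * (n - 1)

    f_ratio = (ss_treatment / df_treatment) / (ss_error / df_error)
    return f_ratio, df_treatment, df_error


f_ratio, df_treatment, df_error = repeated_measures_f(scores)
print(f"F({df_treatment}, {df_error}) = {f_ratio:.2f}")
```

The arithmetic itself describes only the handy sample. Referring the resulting F to its tabled distribution invokes the idealized assumption of random sampling, so any generalization beyond these subjects still rests on the extrastatistical judgment of similarity discussed above.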