Homoscedastic Balanced Two-fold Nested Model when the Number of Sub-classes is Large
Analysis of variance (ANOVA) is a cornerstone of statistical applications. The classical ANOVA model assumes that the error terms are i.i.d. normal, in which case F-statistics have certain optimality properties (cf. Arnold (1981, Chapter 7)). Arnold (1980) showed that the classical F-test is robust to departures from normality if the sample sizes tend to infinity while the number of levels stays fixed. The past decade has witnessed the generation of large data sets, involving a multitude of factor levels, in several areas of scientific investigation. For example, in agricultural trials it is not uncommon to see a large number of treatments but limited replication per treatment; see Brownie and Boos (1994) and Wang and Akritas (2006). Another application arises in certain types of microarray data in which the nested factor corresponds to a large number of genes. In addition to the aforementioned papers by Brownie and Boos and by Wang and Akritas, other relevant literature includes Akritas and Arnold (2000), Bathke (2002) and Akritas and Papadatos (2004).
The above papers deal only with crossed designs. In this article we consider the two-fold nested design and establish the asymptotic theory, both under the null and alternative hypotheses, for the usual F-test statistics of sub-class effects when the number of sub-classes goes to infinity but the number of classes and the number of observations in each sub-class remain fixed. The fixed, random and mixed effects models are all considered. The main finding of the paper is that the classical, normality-based, test procedure is asymptotically robust to departures from the normality assumption.
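For orientation, the balanced two-fold nested fixed-effects model and the usual F-statistic for sub-class effects can be sketched as follows. The symbols a (number of classes), b (number of sub-classes per class), n (observations per sub-class), and the names MSB(A) and MSE are notation assumed here for illustration and may differ from the notation used in the later sections.

```latex
% Balanced two-fold nested model (assumed notation):
% i indexes classes, j sub-classes within class i, k observations.
Y_{ijk} = \mu + \alpha_i + \beta_{ij} + e_{ijk},
\qquad i = 1,\dots,a,\quad j = 1,\dots,b,\quad k = 1,\dots,n.

% Usual F-statistic for the sub-class effects,
% H_0 : \beta_{ij} = 0 \text{ for all } i, j:
F = \frac{\mathrm{MSB(A)}}{\mathrm{MSE}},
\qquad
\mathrm{MSB(A)} = \frac{n \sum_{i=1}^{a} \sum_{j=1}^{b}
    \bigl(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot}\bigr)^{2}}{a(b-1)},
\qquad
\mathrm{MSE} = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n}
    \bigl(Y_{ijk} - \bar{Y}_{ij\cdot}\bigr)^{2}}{ab(n-1)}.
```

Under i.i.d. normal errors and the null hypothesis, F follows an F distribution with a(b-1) and ab(n-1) degrees of freedom. In this notation, the asymptotic regime studied here lets b tend to infinity while a and n stay fixed.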
The rest of this manuscript is organized as follows. The next section reviews the statistical models and states three results that are useful for the asymptotic derivations. Section 3 presents the asymptotic theory for the fixed-effects model, while Section 4 presents the asymptotic theory for both the random-effects and the mixed-effects models. Some simulation results are shown in Section 5, and Section 6 states conclusions.