# Homoscedastic Balanced Two-fold Nested Model when the Number of Sub-classes is Large

Analysis of variance (ANOVA) is a cornerstone of statistical applications. The classical ANOVA model assumes that the error terms are i.i.d. normal, in which case *F*-statistics have certain optimality properties (cf. Arnold (1981, Chapter 7)). Arnold (1980) showed that the classical *F*-test is robust to departures from normality if the sample sizes tend to infinity while the number of levels stays fixed. The past decade has witnessed the generation of large data sets, involving a multitude of factor levels, in several areas of scientific investigation. For example, in agricultural trials it is not uncommon to see a large number of treatments but limited replication per treatment; see Brownie and Boos (1994) and Wang and Akritas (2006). Another application arises in certain types of microarray data in which the nested factor corresponds to a large number of genes. In addition to the aforementioned papers by Brownie and Boos and by Wang and Akritas, other relevant literature includes Akritas and Arnold (2000), Bathke (2002) and Akritas and Papadatos (2004).

The above papers deal only with crossed designs. In this article we consider the two-fold nested design and establish the asymptotic theory, under both the null and alternative hypotheses, for the usual *F*-test statistics of sub-class effects when the number of sub-classes goes to infinity but the number of classes and the number of observations in each sub-class remain fixed. The fixed-, random- and mixed-effects models are all considered. The main finding of the paper is that the classical, normality-based test procedure is asymptotically robust to departures from the normality assumption.
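To make the setting concrete, the usual *F*-statistic for sub-class effects in a balanced two-fold nested layout can be sketched as below. This is a minimal illustration, not the paper's own code: the `(a, b, n)` array layout, the function name `nested_f_statistic`, and the choice of centered exponential errors are all assumptions made for the example.

```python
import numpy as np


def nested_f_statistic(y):
    """F-statistic for sub-class effects in a balanced two-fold nested design.

    y has shape (a, b, n): a classes, b sub-classes per class,
    n observations per sub-class (this layout is an assumption of the sketch).
    """
    y = np.asarray(y, dtype=float)
    a, b, n = y.shape
    subclass_means = y.mean(axis=2)            # shape (a, b)
    class_means = subclass_means.mean(axis=1)  # shape (a,)

    # Sum of squares for sub-classes within classes, and for error.
    ss_sub = n * np.sum((subclass_means - class_means[:, None]) ** 2)
    ss_err = np.sum((y - subclass_means[:, :, None]) ** 2)

    df_sub = a * (b - 1)
    df_err = a * b * (n - 1)
    f_stat = (ss_sub / df_sub) / (ss_err / df_err)
    return f_stat, df_sub, df_err


# Hypothetical data: few classes, many sub-classes, small n, skewed errors,
# and no sub-class effects (i.e. generated under the null hypothesis).
rng = np.random.default_rng(0)
a, b, n = 3, 200, 4
y = rng.exponential(scale=1.0, size=(a, b, n)) - 1.0
f_stat, df_sub, df_err = nested_f_statistic(y)
print(f"F = {f_stat:.3f} on ({df_sub}, {df_err}) degrees of freedom")
```

Here the number of sub-classes `b` is large while `a` and `n` stay small, which mirrors the asymptotic regime described above; the errors are deliberately non-normal.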

The rest of this manuscript is organized as follows. The next section contains a review of the statistical models and states three results that are useful for the asymptotic derivations. In Section 3 we present the asymptotic theory for the fixed-effects model, while Section 4 presents the asymptotic theory for both the random-effects and the mixed-effects models. Some simulation results are shown in Section 5, and finally Section 6 states conclusions.

## Keywords

Asymptotic Distribution, Asymptotic Theory, Null Distribution, Corner Stone, Asymptotic Null Distribution

## References

- Akritas MG, Papadatos N (2004) Heteroscedastic One-Way ANOVA and Lack-of-Fit Tests. Journal of the American Statistical Association 99:368-382
- Akritas MG, Arnold SF (2000) Asymptotics for Analysis of Variance when the Number of Levels is Large. Journal of the American Statistical Association 95:212-226
- Arnold SF (1980) Asymptotic Validity of F Tests for the Ordinary Linear Model and the Multiple Correlation Model. Journal of the American Statistical Association 75:890-894
- Arnold SF (1981) The Theory of Linear Models and Multivariate Analysis. Wiley, New York
- Bathke A (2002) ANOVA for a Large Number of Treatments. Mathematical Methods of Statistics 11:118-132
- Brownie C, Boos DD (1994) Type I Error Robustness of ANOVA and ANOVA on Ranks when the Number of Treatments is Large. Biometrics 50:542-549
- Wang L, Akritas MG (2006) Two-way Heteroscedastic ANOVA when the Number of Levels is Large. Statistica Sinica 16:1387-1408