# Value of sample size for computation of the Bayesian information criterion (BIC) in multilevel modeling

## Abstract

The Bayesian information criterion (BIC) can be useful for model selection in multilevel-modeling studies. However, the BIC formula requires a value for sample size, and the appropriate value is unclear in multilevel models because sample size is observed at two or more levels. In the present study, we used simulated data to evaluate the false-positive rate and the power obtained when the level 1 sample size, the effective sample size, or the level 2 sample size was used as the sample size value, across a range of sample sizes and intraclass correlation coefficient (ICC) values. The results indicated that the appropriate value for sample size depends on the model and the test being conducted. On the basis of the scenarios investigated, we recommend using a BIC that has a separate penalty term for each level of the model, based on the number of fixed effects at each level and the level-based sample sizes.
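To make the competing choices concrete, the following Python sketch computes the BIC under each candidate sample size value. It is an illustration under stated assumptions, not the study's own code: it uses the standard form BIC = -2 log L + k ln(n), takes the effective sample size from the common design-effect formula N / (1 + (n_c - 1) × ICC) for balanced clusters of size n_c, and writes the level-specific variant as one penalty term per level. All function names, parameter names, and numeric inputs are hypothetical, and how variance components should be counted in the level-specific penalty is left open here.

```python
import math


def bic(log_lik, n_params, sample_size):
    """Standard BIC: -2 * log-likelihood + k * ln(sample size)."""
    return -2.0 * log_lik + n_params * math.log(sample_size)


def effective_sample_size(n_level1, cluster_size, icc):
    """Design-effect-adjusted sample size, assuming balanced clusters."""
    return n_level1 / (1.0 + (cluster_size - 1.0) * icc)


def bic_level_specific(log_lik, k_level1, k_level2, n_level1, n_level2):
    """BIC with a separate penalty per level (assumed form):
    level 1 fixed effects are penalized by ln(N), level 2 by ln(J)."""
    return (-2.0 * log_lik
            + k_level1 * math.log(n_level1)
            + k_level2 * math.log(n_level2))


# Hypothetical design: 50 clusters of 20 units each, ICC = 0.2.
n_clusters, cluster_size, icc = 50, 20, 0.2
n_total = n_clusters * cluster_size
n_eff = effective_sample_size(n_total, cluster_size, icc)

# Hypothetical fitted model: log-likelihood and fixed-effect counts.
log_lik, k1, k2 = -1500.0, 2, 2

print("BIC, level 1 N:     ", bic(log_lik, k1 + k2, n_total))
print("BIC, effective N:   ", bic(log_lik, k1 + k2, n_eff))
print("BIC, level 2 N:     ", bic(log_lik, k1 + k2, n_clusters))
print("BIC, level-specific:", bic_level_specific(log_lik, k1, k2, n_total, n_clusters))
```

Because the penalty grows with ln(n), the level 1 sample size gives the harshest penalty and the level 2 sample size the mildest, with the effective sample size in between whenever the ICC is positive. The level-specific form applies the harsher penalty only to level 1 effects.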

## Keywords

Multilevel modeling, Bayesian information criterion, BIC, Monte Carlo study, Model comparison, Hierarchical linear modeling
