Behavior Research Methods, Volume 51, Issue 1, pp. 440–450

Value of sample size for computation of the Bayesian information criterion (BIC) in multilevel modeling

  • Julie Lorah
  • Andrew Womack


The Bayesian information criterion (BIC) can be useful for model selection within multilevel-modeling studies. However, the formula for the BIC requires a value for sample size, which is unclear in multilevel models, since sample size is observed for at least two levels. In the present study, we used simulated data to evaluate the rate of false positives and the power when the level 1 sample size, the effective sample size, and the level 2 sample size were used as the sample size value, under various levels of sample size and intraclass correlation coefficient values. The results indicated that the appropriate value for sample size depends on the model and test being conducted. On the basis of the scenarios investigated, we recommend using a BIC that has different penalty terms for each level of the model, based on the number of fixed effects at each level and the level-based sample sizes.
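To make the three candidate sample-size values concrete, the following is a minimal sketch, not the authors' code. It assumes the standard form BIC = deviance + k ln(n), a balanced design, and the common design-effect formula for effective sample size, N / (1 + (m - 1) ICC). The level-specific variant follows the abstract's description by penalizing only the fixed effects at each level; how variance parameters enter the penalty is left out here. All function names and numeric values are hypothetical.

```python
import math


def bic(deviance, n_params, n):
    """Standard BIC: deviance (-2 log-likelihood) plus n_params * ln(n)."""
    return deviance + n_params * math.log(n)


def effective_sample_size(n_level1, cluster_size, icc):
    """Design-effect-adjusted sample size, assuming equal cluster sizes."""
    return n_level1 / (1.0 + (cluster_size - 1) * icc)


def level_specific_bic(deviance, k_level1, n_level1, k_level2, n_level2):
    """BIC with a separate penalty per level (fixed effects only; an assumption)."""
    return deviance + k_level1 * math.log(n_level1) + k_level2 * math.log(n_level2)


# Hypothetical design: 50 clusters of 20 observations each, ICC = 0.2.
n_clusters = 50
cluster_size = 20
n_level1 = n_clusters * cluster_size
icc = 0.2

# Hypothetical fitted model: deviance of 2500 with two level 1 and one
# level 2 fixed effects.
deviance = 2500.0
k1, k2 = 2, 1

n_eff = effective_sample_size(n_level1, cluster_size, icc)

bic_level1 = bic(deviance, k1 + k2, n_level1)
bic_effective = bic(deviance, k1 + k2, n_eff)
bic_level2 = bic(deviance, k1 + k2, n_clusters)
bic_by_level = level_specific_bic(deviance, k1, n_level1, k2, n_clusters)

for label, value in [
    ("level 1 N", bic_level1),
    ("effective N", bic_effective),
    ("level 2 N", bic_level2),
    ("level-specific penalties", bic_by_level),
]:
    print(f"{label}: {value:.2f}")
```

Because the penalty grows with ln(n), the level 1 sample size gives the harshest penalty and the level 2 sample size the mildest, with the effective sample size in between. When comparing two models, only the difference in penalties matters, so the choice of n can change which model is preferred.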


Multilevel modeling · Bayesian information criterion · BIC · Monte Carlo study · Model comparison · Hierarchical linear modeling




Copyright information

© The Psychonomic Society, Inc. 2019

Authors and Affiliations

  1. Department of Counseling and Educational Psychology, Indiana University School of Education, Bloomington, USA
  2. Department of Statistics, Indiana University, Bloomington, USA
