Testing homogeneity of proportions from sparse binomial data with a large number of groups

  • Junyong Park


In this paper, we consider testing the homogeneity of proportions in independent binomial distributions, especially when the data are sparse and the number of groups is large. We study the proposed tests from several angles: theory, simulations, and a real data application. We present the asymptotic null distributions and asymptotic powers of the proposed tests and compare their performance with that of existing tests. Our simulation studies show that no test dominates the others; however, our proposed test and a few others are expected to control the given size and attain substantial power. We also present a real example regarding the safety concerns associated with Avandia (rosiglitazone) in Nissen and Wolski (New Engl J Med 356:2457–2471, 2007).
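
For orientation, the hypothesis under test is \(H_0: p_1 = \cdots = p_k\) for \(k\) independent binomial counts \(X_i \sim \mathrm{Bin}(n_i, p_i)\). The sketch below computes the classical Pearson chi-square statistic for this hypothesis. It is a standard textbook baseline and not the test proposed in this paper; the function name and the example counts are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence, Tuple


def pearson_homogeneity_statistic(
    successes: Sequence[int], trials: Sequence[int]
) -> Tuple[float, int]:
    """Classical Pearson chi-square statistic for H0: p_1 = ... = p_k.

    Returns the statistic and its nominal degrees of freedom (k - 1).
    Illustrative baseline only; not the test proposed in the paper.
    """
    k = len(successes)
    if k != len(trials) or k < 2:
        raise ValueError("need at least two groups with matching lengths")
    if any(n <= 0 for n in trials):
        raise ValueError("every group needs a positive number of trials")
    if any(x < 0 or x > n for x, n in zip(successes, trials)):
        raise ValueError("each count must satisfy 0 <= x_i <= n_i")

    # Pooled estimate of the common proportion under H0.
    p_hat = sum(successes) / sum(trials)
    if p_hat == 0.0 or p_hat == 1.0:
        raise ValueError("pooled proportion is 0 or 1; statistic is undefined")

    # Sum of squared standardized deviations from the pooled fit.
    stat = sum(
        (x - n * p_hat) ** 2 / (n * p_hat * (1.0 - p_hat))
        for x, n in zip(successes, trials)
    )
    return stat, k - 1


# Hypothetical sparse counts: few events per group.
stat, df = pearson_homogeneity_statistic([1, 0, 2, 0, 1], [10, 12, 8, 15, 10])
print(f"statistic = {stat:.3f}, nominal df = {df}")
```

The usual reference distribution for this statistic is chi-square with \(k-1\) degrees of freedom, an approximation that relies on the expected counts in each group being reasonably large. With sparse data spread over many groups that condition can fail, which is the setting the abstract addresses.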


Keywords: Asymptotic distribution; Homogeneity of proportions; Sparse data


  1. Bathke, A. C., Harrar, S. W. (2008). Nonparametric methods in multivariate factorial designs for large number of factor levels. Journal of Statistical Planning and Inference, 138(3), 588–610.
  2. Bathke, A., Lankowski, D. (2005). Rank procedures for a large number of treatments. Journal of Statistical Planning and Inference, 133(2), 223–238.
  3. Billingsley, P. (1995). Probability and Measure (3rd ed.). Hoboken: Wiley.
  4. Boos, D. D., Brownie, C. (1995). ANOVA and rank tests when the number of treatments is large. Statistics and Probability Letters, 23, 183–191.
  5. Cai, T., Parast, L., Ryan, L. (2010). Meta-analysis for rare events. Statistics in Medicine, 29(20), 2078–2089.
  6. Cochran, W. G. (1954). Some methods for strengthening the common \(\chi ^2\) tests. Biometrics, 10, 417–451.
  7. Greenshtein, E., Ritov, Y. (2004). Persistence in high-dimensional linear predictor selection and the virtue of over parametrization. Bernoulli, 10, 971–988.
  8. Nissen, S. E., Wolski, K. (2007). Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. New England Journal of Medicine, 356(24), 2457–2471.
  9. Park, J. (2009). Independent rule in classification of multivariate binary data. Journal of Multivariate Analysis, 100, 2270–2286.
  10. Park, J., Ghosh, J. K. (2007). Persistence of the plug-in rule in classification of high dimensional multivariate binary data. Journal of Statistical Planning and Inference, 137, 3687–3705.
  11. Potthoff, R. F., Whittinghill, M. (1966). Testing for homogeneity: I. The binomial and multinomial distributions. Biometrika, 53, 167–182.
  12. Shuster, J. J. (2010). Empirical versus natural weighting in random effects meta analysis. Statistics in Medicine, 29, 1259–1265.
  13. Shuster, J. J., Jones, L. S., Salmon, D. A. (2007). Fixed vs random effects meta-analysis in rare event studies: The rosiglitazone link with myocardial infarction and cardiac death. Statistics in Medicine, 26, 4375–4385.
  14. Stijnen, T., Hamza, T. H., Özdemir, P. (2010). Random effects meta-analysis of event outcome in the framework of the generalized linear mixed model with applications in sparse data. Statistics in Medicine, 29, 3046–3067.
  15. Tian, L., Cai, T., Pfeffer, M. A., Piankov, N., Cremieux, P. Y., Wei, L. J. (2009). Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent \(2\times 2\) tables with all available data but without artificial continuity correction. Biostatistics, 10, 275–281.

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2018

Authors and Affiliations

  1. Department of Mathematics and Statistics, University of Maryland Baltimore County, Baltimore, USA
