How to guarantee finding a statistically significant difference: the use and abuse of subgroup analyses

Quality of Life Research


Fig 1


Author information

Correspondence to Peter M. Fayers.

Appendix

Type 1 error for independent hypothesis tests

If k independent hypothesis tests are carried out, each at a significance level (P value threshold) of $\alpha_0$, the overall probability of at least one type 1 error (false positive) is $\alpha = 1 - (1 - \alpha_0)^k$. Thus the risk of a false positive result increases rapidly as k increases. Suppose a factor used for subgroup analyses has two levels, thus dividing the data into two subgroups. If this factor is unrelated to outcome, each of the two portions of the data is equivalent to an independent random sample. Thus the probability that at least one of these subgroups is falsely significant at P < 0.05 is $1 - (1 - 0.05)^2 = 1 - 0.95^2 = 0.0975$, which is nearly double the nominal 0.05.
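The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `familywise_error` is ours, not the authors', and the formula applies only when the k tests are independent.

```python
def familywise_error(alpha0: float, k: int) -> float:
    """Probability of at least one false positive among k independent tests,
    each carried out at significance level alpha0."""
    return 1.0 - (1.0 - alpha0) ** k


# Two independent subgroups tested at 0.05 give 0.0975, as in the text.
for k in (1, 2, 5, 10, 20):
    print(f"k = {k:2d}: overall type 1 error = {familywise_error(0.05, k):.4f}")
```

With k = 1 the function returns the nominal level, and the value rises towards 1 as k grows.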

Type 1 error for multiple subgroup analyses—simulations

Suppose m factors are used for subgroup analyses, each dividing the data into two approximately equal halves. Also, assume that these factors are independent of each other. For example, one factor might be gender, and another factor might be age grouping defined as above or below the median age. Although these factors are independent, the subgroups formed by them will include overlapping subjects. For example, roughly half of the female respondents will also be included in the young age group. Therefore, even though the factors are independent, the resultant P values will be correlated. This makes analytical solutions more difficult.

A simple way to estimate the type 1 error is to use computer simulations. We assumed that there was, in truth, no treatment effect in any of the subgroups, that the outcome of interest followed a normal distribution, and that a t-test would be applied. Binary factors were applied, each effectively dichotomising the data into two separate halves. Data sets of 50, 100, 200, 300, 400 and 500 normally distributed random observations were generated. Each of these simulations was repeated 40,000 times, and the proportion of studies in which at least one subgroup had a P value below 0.05 was counted. As might be anticipated, sample size did not affect these proportions (sample size should only affect the type 2 error).

The results of the simulation are summarised in Fig. 1.
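A simulation of this kind might be sketched as follows. This is not the authors' code, and several details are assumptions made for illustration: a two-arm trial with equal arm sizes, factor levels assigned to each subject by a fair coin flip, and far fewer repetitions than the 40,000 reported. To stay within the Python standard library, the sketch also substitutes a two-sided z-test with known unit variance for the t-test; under the null hypothesis this has the same nominal 5% type 1 error. All function and parameter names are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def subgroup_false_positive_rate(n_per_arm, m_factors, n_sims=2000,
                                 alpha=0.05, seed=1):
    """Estimate the chance that at least one subgroup test is 'significant'
    when there is no true treatment effect anywhere."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    hits = 0
    for _ in range(n_sims):
        # Both arms are drawn from N(0, 1), so the null hypothesis is true.
        arms = []
        for _arm in range(2):
            outcomes = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
            # Each subject gets m independent binary factor levels.
            factors = [[rng.random() < 0.5 for _ in range(m_factors)]
                       for _ in range(n_per_arm)]
            arms.append((outcomes, factors))

        significant = False
        for f in range(m_factors):
            for level in (False, True):
                # Compare the two arms within one subgroup of factor f.
                a = [y for y, fac in zip(*arms[0]) if fac[f] == level]
                b = [y for y, fac in zip(*arms[1]) if fac[f] == level]
                if not a or not b:
                    continue  # empty subgroup: no test possible
                # z-test with known unit variance (stand-in for the t-test).
                z = (fmean(a) - fmean(b)) / (1 / len(a) + 1 / len(b)) ** 0.5
                if abs(z) > z_crit:
                    significant = True
        hits += significant
    return hits / n_sims


if __name__ == "__main__":
    for m in (1, 3, 5):
        rate = subgroup_false_positive_rate(n_per_arm=50, m_factors=m)
        print(f"{m} factor(s), {2 * m} subgroup tests: estimated rate = {rate:.3f}")
```

With a single factor the two subgroups do not overlap, so the estimate should lie close to the 0.0975 derived for two independent tests. With more factors the subgroups share subjects, so the P values are correlated and the estimate need not match the independent-test formula exactly.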


Cite this article

Fayers, P.M., King, M.T. How to guarantee finding a statistically significant difference: the use and abuse of subgroup analyses. Qual Life Res 18, 527–530 (2009). https://doi.org/10.1007/s11136-009-9473-3


  • DOI: https://doi.org/10.1007/s11136-009-9473-3
