Blind Guides for Explorers: P Values, Subgroups and Data Dredging

  • Lemuel A. Moyé


Research investigators are trained to be thorough, to find everything worth finding in their data. Having invested great time and effort in their studies, these scientists want and need to examine the data systematically and completely. In clinical experiments with multiple endpoints, investigators will examine the effect of therapy on each one, with the anticipated reward of identifying a therapy effect for the experiment’s primary endpoint. However, investigators are aware that other analyses may yield unanticipated findings and that these analyses, unlike the primary endpoint analysis, were not prospectively stated. Investigators believe that, like unsuspected pots of gold, these tantalizing surprises might lie just under the surface, hidden from view, waiting to be found. If the experiment demonstrated that an intervention reduces the incidence of heart attacks, then maybe there is a hidden relationship between marital status and heart attacks. Perhaps there is an unanticipated relationship between the patient’s astrologic sign and the occurrence of a heart attack. Sometimes it is not the investigator who raises these questions. During the publication process, reviewers of the manuscript will sometimes ask that additional analyses be carried out. These analyses can include examining the effect of the intervention in subsets of the data. Does the therapy work equally well in women and men? Does it work equally well in different racial groups? What about in patients with a previous heart attack? These analyses are demanded by others, but they too are not prospectively stated.


Keywords (added by machine, not by the author): Heart Attack; Confirmatory Analysis; Strip Mining; Subgroup Finding; Prevent Heart Attack Trial





Copyright information

© Springer Science+Business Media New York 2000

Authors and Affiliations

  • Lemuel A. Moyé, School of Public Health, University of Texas, Houston, USA
