Delta inflation: a bias in the design of randomized controlled trials in critical care medicine
Abstract
Introduction
Mortality is the most widely accepted outcome measure in randomized controlled trials of therapies for critically ill adults, but most of these trials fail to show a statistically significant mortality benefit. The reasons for this are unknown.
Methods
We searched five high-impact journals (Annals of Internal Medicine, British Medical Journal, JAMA, The Lancet, New England Journal of Medicine) for randomized controlled trials comparing mortality of therapies for critically ill adults over a 10-year period. We abstracted data on the statistical design and results of these trials to compare the predicted delta (the effect size of the therapy compared to control, expressed as an absolute mortality reduction) with the observed delta, to determine whether a systematic overestimation of predicted delta might explain the high prevalence of negative results in these trials.
Results
We found 38 trials meeting our inclusion criteria. Only 5/38 (13.2%) of the trials provided justification for the predicted delta. The mean predicted delta among the 38 trials was 10.1% and the mean observed delta was 1.4% (P < 0.0001), resulting in a delta-gap of 8.7%. In only 2/38 (5.3%) of the trials did the observed delta exceed the predicted delta and only 7/38 (18.4%) of the trials demonstrated statistically significant results in the hypothesized direction; these trials had smaller delta-gaps than the remainder of the trials (delta-gap 0.9% versus 10.5%; P < 0.0001). For trials showing nonsignificant trends toward benefit greater than 3%, large increases in sample size (380% to 1,100%) would be required if repeat trials use the observed delta from the index trial as the predicted delta for a follow-up study.
Conclusions
Investigators of therapies for critical illness systematically overestimate treatment effect size (delta) during the design of randomized controlled trials. This bias, which we refer to as "delta inflation", is a potential reason that these trials have a high rate of negative results.
"Absence of evidence is not evidence of absence."
Keywords
Critical Care; Critical Illness; Sample Size Calculation; Minimal Clinically Important Difference; Treatment Effect Size
Abbreviations
delta: effect size
MCID: minimal clinically important difference.
Introduction
Mortality has become the standard outcome measure in trials of therapies in critically ill adults because it obviates debate about clinical relevance and concerns of ascertainment bias. However, it has recently been noted that the majority of these trials fail to demonstrate efficacy [1] and several therapies that appeared promising did not demonstrate efficacy on repeated study [2, 3, 4, 5, 6, 7]. The high rate of negative results in these trials could be explained by several possibilities including true lack of efficacy (the null hypothesis is true), type II statistical errors in trials with adequate power, and methodological problems in study design leading to inadequate power and sample size [8].
Several parameters must be chosen by investigators in the design of a trial of mortality in order to determine the required sample size, including the significance level required for rejection of the null hypothesis; power; the predicted mortality rate in the placebo arm; and the predicted effect size (delta). In contrast to significance level and power, which are usually set by convention at 0.05 and 90%, respectively, predictions about the placebo mortality rate must be guided by preliminary data (if available) or guesswork. Likewise, predictions of delta are either based on existing data or are guided by biological plausibility or a minimal clinically important difference (MCID) [9, 10]. Using these four variables (significance level, power, baseline mortality rate, and delta) sample size required for the trial can be calculated.
Simulated scenarios for sample size determination in the design of a hypothetical study
Parameter | Standard scenario | Relaxed significance level | Relaxed power | Baseline mortality shifted away from 50% | Inflated delta
Significance level (two-sided) | 0.05 | 0.1 | 0.05 | 0.05 | 0.05
Power | 90% | 90% | 80% | 90% | 90%
Baseline (placebo) mortality rate | 50% | 50% | 50% | 40% | 50%
Delta (ARR) | 10% | 10% | 10% | 10% | 15%
Required sample size | 1076 | 884 | 816 | 992 | 480
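The required sample sizes in the table can be reproduced with a standard two-sided, two-proportion formula plus a continuity correction. The sketch below is a minimal illustration, not the authors' code: the article states only that Stata 8.0 was used, so the choice of formula (pooled variance under the null hypothesis, continuity correction, equal allocation) is an assumption that happens to match the tabulated values. The function name `total_sample_size` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def total_sample_size(alpha, power, p_control, delta):
    """Total patients (two equal arms) needed to detect an absolute
    mortality reduction `delta` from a control mortality `p_control`.

    Assumed method: normal approximation for two proportions with a
    continuity correction; this is an illustrative choice.
    """
    p_treat = p_control - delta
    p_bar = (p_control + p_treat) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)

    # Uncorrected per-arm size: pooled variance under H0, unpooled under H1.
    spread_h0 = sqrt(2 * p_bar * (1 - p_bar))
    spread_h1 = sqrt(p_control * (1 - p_control) + p_treat * (1 - p_treat))
    n = (z_alpha * spread_h0 + z_beta * spread_h1) ** 2 / delta ** 2

    # Continuity correction applied to the per-arm size.
    n_corrected = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return 2 * ceil(n_corrected)


# Scenarios from the table: (alpha, power, baseline mortality, delta).
scenarios = {
    "Standard scenario": (0.05, 0.90, 0.50, 0.10),
    "Relaxed significance level": (0.10, 0.90, 0.50, 0.10),
    "Relaxed power": (0.05, 0.80, 0.50, 0.10),
    "Baseline mortality 40%": (0.05, 0.90, 0.40, 0.10),
    "Inflated delta": (0.05, 0.90, 0.50, 0.15),
}

for name, args in scenarios.items():
    print(f"{name}: {total_sample_size(*args)}")
```

Of the four inputs, delta has by far the largest leverage: raising it from 10% to 15% more than halves the required sample size, whereas relaxing the significance level or power changes it by roughly a fifth to a quarter.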
Materials and methods
One author (SKA) performed a search of the tables of contents of five high-impact medical journals (BMJ, New England Journal of Medicine, Journal of the American Medical Association, Lancet, Annals of Internal Medicine) for titles containing the keywords (and variations thereof) critically ill, intensive care, ICU, acute respiratory distress syndrome, acute lung injury, sepsis, shock, ventilator, ventilation, respiratory failure, multiple organ dysfunction, continuous veno-venous hemodialysis, and renal failure, but not containing keywords related to pediatrics (neonatal, infant, children, prematurity), published between 1 January 1999 and 22 July 2009. Articles containing included keywords were then reviewed further to determine if they met inclusion and exclusion criteria. Articles were included if they described a randomized controlled trial in a critically ill adult population that evaluated proportional mortality (mortality expressed as a proportion, as opposed to mean survival or a time-to-event analysis) as the primary endpoint upon which power calculations were based. Articles were excluded if they described a non-inferiority trial, if they dealt with a non-ICU population (out of hospital, prehospital, or care not described as delivered in an ICU setting), or if they included non-adult patients. Factorial trials testing more than one therapy were considered as separate trials for each therapy tested, even if reported in the same manuscript.
Data were abstracted from articles meeting these criteria utilizing a standardized form. We recorded variables pertaining to statistical methods including significance level, power, delta, the expected baseline (placebo or standard care) mortality rate, the a priori sample size, whether the study was terminated early, and any modifications made to the sample size in the middle of the trial. We recorded whether the predicted delta was justified by reference to either published or unpublished data. We abstracted data from the results of the trial including the number of patients in the treatment and placebo arms that were included in the final data analysis, and the mortality rate in each arm. We recorded unadjusted results and those pertaining to the overall (intention-to-treat) population (so that the results would correspond to the assumptions of the power calculations) even where the authors emphasized adjusted or subgroup analyses. For three trials that did not report the predicted delta, we contacted the authors to obtain this information. For one of these trials [17], the predicted delta could not be determined and the study was excluded. For the other two trials, the authors provided information about the predicted delta and sample size calculations not included in the original manuscript.
Using these data, we performed confirmatory sample size calculations for each trial, determined the observed treatment effect (delta) and the difference between the predicted and observed delta (the delta-gap), calculated the 95% confidence interval for the observed delta, and plotted a graph of observed versus predicted delta. We calculated mean predicted and observed delta values across all trials, and compared them using a paired t-test with unequal variances. For non-statistically significant trials that had an observed delta greater than the smallest predicted delta of all the trials (3% [18]), we calculated the sample size that would be required if the trials were to be repeated using the observed delta of the index trial as the predicted delta for the future trial. All statistical calculations were performed using STATA version 8.0 (College Station, TX, USA).
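As an illustration of the per-trial quantities described above, the following sketch computes the observed delta, the delta-gap (predicted minus observed delta), and a 95% confidence interval for one trial. The trial counts are hypothetical, the function name `observed_delta_summary` is invented for this example, and the Wald (normal approximation) interval is an assumption; the article does not state which interval method was used.

```python
from math import sqrt
from statistics import NormalDist


def observed_delta_summary(deaths_control, n_control, deaths_treat, n_treat,
                           predicted_delta, level=0.95):
    """Observed delta (absolute mortality reduction), delta-gap and a
    Wald confidence interval for the observed delta."""
    p_control = deaths_control / n_control
    p_treat = deaths_treat / n_treat
    delta = p_control - p_treat  # positive values favor the therapy

    # Standard error of a difference between two independent proportions.
    se = sqrt(p_control * (1 - p_control) / n_control
              + p_treat * (1 - p_treat) / n_treat)
    z = NormalDist().inv_cdf(0.5 + level / 2)

    return {
        "observed_delta": delta,
        "delta_gap": predicted_delta - delta,
        "ci": (delta - z * se, delta + z * se),
    }


# Hypothetical trial: 250 patients per arm, 100 vs 90 deaths,
# designed around a predicted delta of 10%.
summary = observed_delta_summary(100, 250, 90, 250, predicted_delta=0.10)
print(summary)
```

In this made-up trial the interval contains both zero and the predicted delta of 10%, so the result is "negative" without excluding the effect the trial was designed to detect.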
Results
Our search identified 160 articles for further review. Of these, 58 described trials that were not randomized controlled trials, 46 were excluded because mortality was not the primary outcome on which power calculations were based, 12 were excluded because they dealt with non-critically ill populations, 2 were excluded because they described non-inferiority trials, 1 was excluded because it dealt with pediatric patients, and 1 was excluded because no predicted delta was reported and the authors could not provide the information. The remaining 38 articles were included in our analysis.
Additional file 1 shows the characteristics of the included trials. Among all trials, only 5 of the 38 (13.2%) provided justification for the predicted delta, and 7 of the 38 (18.4%) provided justification for the baseline mortality rate used in sample size calculations (data not shown). Among all included trials, 27 of the 38 (71%) provided sufficient information for us to replicate the sample size calculations. For 20 of these 27 trials (74%), our sample size calculations yielded values that deviated less than 10% from the a priori sample sizes specified in the manuscript.
Among all trials, 17 of 38 (44.7%) had an observed delta with a negative value (that is, the treatment was numerically worse than the comparator). Three of these trials showed a statistically significant increase in mortality with the therapy, and all of these trials were stopped early for harm [4, 21, 22]. The seven trials showing a statistically significant difference favoring the therapy had a smaller delta-gap compared with nonsignificant trials and those demonstrating harm (delta-gap 0.9% versus 10.5%; P < 0.0001). In Figure 1, these seven trials are represented by red triangles above zero on the Y-axis; as can be seen graphically, the deltas associated with these trials fall closer to the blue unity line than the other trials.
For the eight trials that showed a non-statistically significant point estimate for delta that exceeded the smallest predicted delta of all trials (3% [18]), we calculated the sample size that would be required to repeat the study using the observed delta of the index study as the predicted delta for the repeat study. Repeating these trials in this fashion would require increases in sample size from 380% to 1,100% compared with the sample size of the index study (data not shown).
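The size of these increases follows from the inverse-square dependence of sample size on delta. As a rough sketch, using the simple uncorrected two-proportion formula (an approximation chosen here for clarity, with the symbols below introduced for illustration) and hypothetical values of 10% for the predicted delta and 4% for the observed delta:

```latex
% Per-arm sample size, uncorrected normal approximation
% (p_c, p_t: control and treatment mortality; \Delta = p_c - p_t)
n_{\text{per arm}} \approx
  \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}
        \left[\,p_c(1-p_c) + p_t(1-p_t)\,\right]}{\Delta^{2}}

% With \alpha, power and the variance term roughly unchanged,
% repeating a trial with the observed delta scales the sample size by
\frac{n_{\text{repeat}}}{n_{\text{index}}} \approx
  \left(\frac{\Delta_{\text{predicted}}}{\Delta_{\text{observed}}}\right)^{2}

% Hypothetical example: predicted 10%, observed 4%
\left(\frac{0.10}{0.04}\right)^{2} = 6.25
```

Under these assumptions, a repeat trial would need roughly six times as many patients as the index trial, which is of the same order as the increases reported above.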
Discussion
We found that randomized controlled trials of therapies in critical care medicine evaluating proportional mortality as a primary endpoint and published in five high-impact medical journals during the past 10 years utilized predicted values of delta in power calculations that systematically overestimated observed values of delta. We propose that this phenomenon of 'delta inflation' represents a bias in the design of such trials with attendant implications for the design of future trials and the practice of critical care medicine.
Our results accord with the findings of a recent report that found low rates of efficacy in trials in critical care medicine, a finding the authors attributed to the use of mortality as an endpoint [1]. We extend this work by identifying a key feature of such trials, namely that the predicted delta almost uniformly overestimates the observed delta. This phenomenon of 'delta inflation' is a possible reason that many of these trials fail to demonstrate efficacy. Other investigators have found discrepancies between predicted and observed delta in other fields and with other outcomes, but the overall prevalence of delta inflation in clinical investigation is unknown [23, 24]. Our study also complements reports showing that sample size calculations are inadequately or disingenuously reported in randomized controlled trials [8, 25, 26]. It expands this work by demonstrating that even when there is adequate reporting of statistical methodology, one component of sample size estimation is biased, thus rendering the entire procedure unreliable [9].
The reasons for the discrepancy between predicted and observed delta cannot be determined from our data, but invite speculation. One possibility is that investigators are choosing delta based on sample size rather than choosing sample size based on delta [8, 11]. Another possibility is that investigators are overly optimistic about the efficacy and effect size of a therapy and that delta inflation is born of unrealistic optimism [27]. There may also be a belief that effect sizes below some threshold (say, 10%) are not clinically important, but this notion is undermined by investigations that sought predicted delta values as low as 3% and by other evidence [18, 28]. Moreover, although it has been suggested that delta should be based on an assessment of the MCID, our finding of wide variation in predicted deltas in studies with the same primary outcome demonstrates that this is not happening [29, 30, 31]. Publication bias affecting pilot trials may cause those with smaller effect sizes to go unpublished, thereby inflating the apparent benefit of a therapy when and if a literature search is performed [32]; however, the low rate of referenced justification for predicted delta that we and others have documented argues against this [24, 33, 34]. The insistence on mortality as the gold standard outcome measure in critical care research, combined with funding constraints, may pressure investigators to search for unrealistic mortality benefits and perhaps to hope that significant improvements in secondary outcome measures will lead to adoption of the therapy [35, 36]. Indeed, the very concept of power and the so-called 'double-significance' approach to hypothesis testing and sample size determination has been called into question [37]. Finally, a looming possibility is that the null hypothesis is true and most therapies for critical illness simply are not efficacious. Given the wide confidence intervals around observed delta in the trials in our analysis, this is impossible to disprove with existing data. However, the consistent conduct of trials of therapies that are in reality not efficacious would essentially amount to an extreme form of delta inflation. In any case, investigators should take stock of the fact that deltas of 10% or greater are rarely found, and attention needs to be refocused on what constitutes the minimal clinically important difference in trials of therapies to reduce mortality in critical illness [9, 31].
Regardless of the causes of delta inflation, its effects are likely deleterious. Firstly, some authors have argued that underpowered trials are unethical and trials designed with delta inflation are essentially underpowered [38]. Secondly, insomuch as delta inflation leads to trials that are 'negative', it may contribute to the premature abandonment of promising therapies because of the commonly held belief that 'absence of evidence is evidence of absence' [39]. This is compounded by the fact that delta inflation can conceal the low statistical power of a trial, thus falsely assuring clinicians that a true difference has been ruled out by a trial with a low type II error rate. Thirdly, the conduct of trials with delta inflation may represent a waste of resources because it undermines their scientific and clinical validity and value to society.
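To make concrete how an inflated delta can conceal low power, the sketch below computes the approximate power of a trial sized for a 10% mortality reduction when the true reduction is only 3%. It is a minimal illustration under stated assumptions: a two-sided test of two proportions by normal approximation, no continuity correction, rejection in the wrong direction ignored, and hypothetical design values (538 patients per arm, 50% control mortality). The function name `approximate_power` is invented for this example.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(n_per_arm, p_control, true_delta, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions,
    given the true absolute mortality reduction `true_delta`."""
    p_treat = p_control - true_delta
    p_bar = (p_control + p_treat) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)

    spread_h0 = sqrt(2 * p_bar * (1 - p_bar))  # pooled, under H0
    spread_h1 = sqrt(p_control * (1 - p_control)
                     + p_treat * (1 - p_treat))  # unpooled, under H1

    # Solve the sample size formula for z_beta, then convert to power.
    z_beta = (true_delta * sqrt(n_per_arm) - z_alpha * spread_h0) / spread_h1
    return NormalDist().cdf(z_beta)


# Hypothetical design: 538 patients per arm, 50% control mortality,
# sized to detect a predicted delta of 10% with about 90% power.
power_as_designed = approximate_power(538, 0.50, 0.10)
power_if_delta_is_3 = approximate_power(538, 0.50, 0.03)
print(f"power if true delta is 10%: {power_as_designed:.2f}")
print(f"power if true delta is 3%:  {power_if_delta_is_3:.2f}")
```

Under these assumptions, the nominal power of about 90% applies only if the true effect equals the predicted 10%; for a true effect of 3%, the same trial would detect the difference in fewer than one in four repetitions.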
If delta inflation exists, several approaches might minimize its impact. Firstly, not only should predicted delta be reported [40], but it should also be justified by a referenced review of available evidence or a statement about biological plausibility or the MCID, especially when predicted delta exceeds a nominal value such as 3% [18, 24]. Results of trials should report confidence intervals for delta rather than P values and should emphasize that the results excluded a difference greater than the upper confidence limit rather than stating that the results failed to find a statistically significant difference [11, 13, 37]. A 'buffer' to account for delta inflation could be built into power calculations, as is now done for anticipated rates of drop-out and loss to follow-up. Moreover, the use of mortality as the only accepted primary outcome for trials of therapies for critical illness should be reconsidered, because few therapies in critical care are ultimately shown to reduce mortality [1, 23]. Consideration might be given to the use of composite [41] or weighted composite [42, 43] endpoints in which each part of the composite is weighted according to its relative value. For example, a composite endpoint might comprise mortality, renal replacement therapy, mechanical ventilation, non-ambulatory status, or receiving nutritional support at some predetermined time point (e.g., 28 or 60 days). More research related to long-term outcomes in critical illness and their relative values will be needed to inform the choice of components of composite endpoints [44].
There are several limitations of our study. As we limited our search to five high-impact journals, it is possible that we have overestimated the prevalence of delta inflation because of omission of trials that more accurately predicted delta in other journals. This is unlikely because high-impact journals are more likely to publish 'positive' trials and those with larger sample sizes and larger effects, and thus our analysis may have underestimated the prevalence and impact of delta inflation. For the sake of homogeneity, we limited our analysis to critical care trials that utilized mortality as a primary endpoint, and therefore our findings may not be generalizable to trials in other specialties and those using other primary outcomes. Nonetheless, the same pressures faced by critical care investigators may be experienced by investigators in other fields pursuing other outcomes who may likewise be susceptible to delta inflation. Determination of the prevalence of delta inflation in other arenas will require specific study.
Conclusions
Delta inflation, a systematic overestimation in predictions of treatment effect size during trial design, is common in randomized controlled trials of mortality in critical care medicine. Reliable methods for predicting delta during study design and better reporting of the basis for these predictions are needed to minimize the risk of trial failure from type II statistical errors and resulting waste of research resources. Consideration should be given to designing such trials with other clinically meaningful primary endpoints. Critical care practitioners and investigators must be aware that because of delta inflation, negative results in randomized controlled trials do not rule out efficacy of the therapies evaluated.
Key messages

Most therapies for adult critical illness fail to demonstrate efficacy in randomized controlled trials.

In the design of randomized controlled trials, investigators must determine a realistic estimate of the effect size (delta) of the therapy on an outcome of interest such as mortality.

In randomized controlled trials in critical care, predicted delta almost always exceeds the delta observed in the trial data.

This 'delta inflation' is a potential reason that most such trials fail to demonstrate efficacy.

Critical care practitioners and investigators must bear in mind that 'absence of evidence is not evidence of absence'.
References
1. Ospina-Tascon GA, Buchele GL, Vincent JL: Multicenter, randomized, controlled trials evaluating mortality in intensive care: doomed to fail? Crit Care Med 2008, 36: 1311-1322. 10.1097/CCM.0b013e318168ea3e
2. Van den Berghe G, Wouters P, Weekers F, Verwaest C, Bruyninckx F, Schetz M, Vlasselaers D, Ferdinande P, Lauwers P, Bouillon R: Intensive insulin therapy in critically ill patients. N Engl J Med 2001, 345: 1359-1367. 10.1056/NEJMoa011300
3. Van den Berghe G, Wilmer A, Hermans G, Meersseman W, Wouters PJ, Milants I, Van Wijngaerden E, Bobbaers H, Bouillon R: Intensive insulin therapy in the medical ICU. N Engl J Med 2006, 354: 449-461. 10.1056/NEJMoa052521
4. The NICE-SUGAR Study Investigators: Intensive versus conventional glucose control in critically ill patients. N Engl J Med 2009, 360: 1283-1297. 10.1056/NEJMoa0810625
5. Annane D, Sébille V, Charpentier C, Bollaert PE, François B, Korach JM, Capellier G, Cohen Y, Azoulay E, Troché G, Chaumet-Riffaud P, Bellissant E: Effect of treatment with low doses of hydrocortisone and fludrocortisone on mortality in patients with septic shock. JAMA 2002, 288: 862-871. 10.1001/jama.288.7.862
6. Sprung CL, Annane D, Keh D, Moreno R, Singer M, Freivogel K, Weiss YG, Benbenishty J, Kalenka A, Forst H, Laterre PF, Reinhart K, Cuthbertson BH, Payen D, Briegel J, CORTICUS Study Group: Hydrocortisone therapy for patients with septic shock. N Engl J Med 2008, 358: 111-124. 10.1056/NEJMoa071366
7. Brunkhorst FM, Engel C, Bloos F, Meier-Hellmann A, Ragaller M, Weiler N, Moerer O, Gruendling M, Oppert M, Grond S, Olthoff D, Jaschinski U, John S, Rossaint R, Welte T, Schaefer M, Kern P, Kuhnt E, Kiehntopf M, Hartog C, Natanson C, Loeffler M, Reinhart K, German Competence Network Sepsis (SepNet): Intensive insulin therapy and pentastarch resuscitation in severe sepsis. N Engl J Med 2008, 358: 125-139. 10.1056/NEJMoa070716
8. Charles P, Giraudeau B, Dechartres A, Baron G, Ravaud P: Reporting of sample size calculation in randomised controlled trials: review. BMJ 2009, 338: b1732. 10.1136/bmj.b1732
9. Moher D, Schulz KF, Altman D, for the CONSORT Group: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. JAMA 2001, 285: 1987-1991. 10.1001/jama.285.15.1987
10. Chan KB, Man-Son-Hing M, Molnar FJ, Laupacis A: How well is the clinical importance of study results reported? An assessment of randomized controlled trials. CMAJ 2001, 165: 1197-1202.
11. Schulz KF, Grimes DA: Sample size calculations in randomised trials: mandatory and mystical. Lancet 2005, 365: 1348-1353. 10.1016/S0140-6736(05)61034-3
12. Matthews JN: Small clinical trials: are they all bad? Stat Med 1995, 14: 115-126. 10.1002/sim.4780140204
13. Goodman SN, Berlin JA: The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med 1994, 121: 200-206.
14. Guyatt GH, Mills EJ, Elbourne D: In the era of systematic reviews, does the size of an individual trial still matter? PLoS Med 2008, 5: e4. 10.1371/journal.pmed.0050004
15. Harvey S, Harrison DA, Singer M, Ashcroft J, Jones CM, Elbourne D, Brampton W, Williams D, Young D, Rowan K, PAC-Man study collaboration: Assessment of the clinical effectiveness of pulmonary artery catheters in management of patients in intensive care (PAC-Man): a randomised controlled trial. Lancet 2005, 366: 472-477. 10.1016/S0140-6736(05)67061-4
16. The National Heart, Lung, and Blood Institute Acute Respiratory Distress Syndrome (ARDS) Clinical Trials Network: Efficacy and safety of corticosteroids for persistent acute respiratory distress syndrome. N Engl J Med 2006, 354: 1671-1684. 10.1056/NEJMoa051693
17. Abraham E, Reinhart K, Opal S, Demeyer I, Doig C, Rodriguez AL, Beale R, Svoboda P, Laterre PF, Simon S, Light B, Spapen H, Stone J, Seibert A, Peckelsen C, De Deyne C, Postier R, Pettilä V, Artigas A, Percell SR, Shu V, Zwingelstein C, Tobias J, Poole L, Stolzenbach JC, Creasey AA, OPTIMIST Trial Study Group: Efficacy and safety of tifacogin (recombinant tissue factor pathway inhibitor) in severe sepsis: a randomized controlled trial. JAMA 2003, 290: 238-247. 10.1001/jama.290.2.238
18. Finfer S, Bellomo R, Boyce N, French J, Myburgh J, Norton R, SAFE Study Investigators: A comparison of albumin and saline for fluid resuscitation in the intensive care unit. N Engl J Med 2004, 350: 2247-2256. 10.1056/NEJMoa040232
19. Bernard GR, Vincent JL, Laterre PF, LaRosa SP, Dhainaut JF, Lopez-Rodriguez A, Steingrub JS, Garber GE, Helterbrand JD, Ely EW, Fisher CJ Jr, Recombinant human protein C Worldwide Evaluation in Severe Sepsis (PROWESS) study group: Efficacy and safety of recombinant human activated protein C for severe sepsis. N Engl J Med 2001, 344: 699-709. 10.1056/NEJM200103083441001
20. Rivers E, Nguyen B, Havstad S, Ressler J, Muzzin A, Knoblich B, Peterson E, Tomlanovich M, Early Goal-Directed Therapy Collaborative Group: Early goal-directed therapy in the treatment of severe sepsis and septic shock. N Engl J Med 2001, 345: 1368-1377. 10.1056/NEJMoa010307
21. Esteban A, Anzueto A, Frutos F, Alía I, Brochard L, Stewart TE, Benito S, Epstein SK, Apezteguía C, Nightingale P, Arroliga AC, Tobin MJ, Mechanical Ventilation International Study Group: Characteristics and outcomes in adult patients receiving mechanical ventilation: a 28-day international study. JAMA 2002, 287: 345-355. 10.1001/jama.287.3.345
22. Sloan EP, Koenigsberg M, Gens D, Cipolle M, Runge J, Mallory MN, Rodman G Jr: Diaspirin cross-linked hemoglobin (DCLHb) in the treatment of severe traumatic hemorrhagic shock: a randomized controlled efficacy trial. JAMA 1999, 282: 1857-1864. 10.1001/jama.282.19.1857
23. Weaver CS, Leonardi-Bee J, Bath-Hextall FJ, Bath PM: Sample size calculations in acute stroke trials: a systematic review of their reporting, characteristics, and relationship with outcome. Stroke 2004, 35: 1216-1224. 10.1161/01.STR.0000125010.70652.93
24. Raju TN, Langenberg P, Sen A, Aldana O: How much 'better' is good enough? The magnitude of treatment effect in clinical trials. Am J Dis Child 1992, 146: 407-411.
25. Moher D, Dulberg CS, Wells GA: Statistical power, sample size, and their reporting in randomized controlled trials. JAMA 1994, 272: 122-124. 10.1001/jama.272.2.122
26. Chan AW, Hrobjartsson A, Jorgensen KJ, Gotzsche PC, Altman DG: Discrepancies in sample size calculations and data analyses reported in randomised trials: comparison of publications with protocols. BMJ 2008, 337: a2299. 10.1136/bmj.a2299
27. Chalmers I, Matthews R: What are the implications of optimism bias in clinical research? Lancet 2006, 367: 449-450. 10.1016/S0140-6736(06)68153-1
28. Aberegg SK, O'Brien J Jr, Khoury P, Patel R, Arkes HR: The influence of treatment effect size on willingness to adopt a therapy. Med Decis Making 2009, 29: 599-605. 10.1177/0272989X09336078
29. Gould ALL: Planning and revising the sample size for a trial. Stat Med 1995, 14: 1039-1051. 10.1002/sim.4780140922
30. Naylor CD, Llewellyn-Thomas HA: Can there be a more patient-centred approach to determining clinically important effect sizes for randomized treatment trials? J Clin Epidemiol 1994, 47: 787-795. 10.1016/0895-4356(94)90176-7
31. Chan KB, Man-Son-Hing M, Molnar FJ, Laupacis A: How well is the clinical importance of study results reported? An assessment of randomized controlled trials. CMAJ 2001, 165: 1197-1202.
32. Decullier E, Chan AW, Chapuis F: Inadequate dissemination of phase I trials: a retrospective cohort study. PLoS Med 2009, 6: e1000034. 10.1371/journal.pmed.1000034
33. Bedard PL, Krzyzanowska MK, Pintilie M, Tannock IF: Statistical power of negative randomized controlled trials presented at American Society for Clinical Oncology annual meetings. J Clin Oncol 2007, 25: 3482-3487. 10.1200/JCO.2007.11.3670
34. Hebert RS, Wright SM, Dittus RS, Elasy TA: Prominent medical journals often provide insufficient information to assess the validity of studies with negative results. J Negat Results Biomed 2002, 1: 1. 10.1186/1477-5751-1-1
35. National Heart, Lung, and Blood Institute Acute Respiratory Distress Syndrome (ARDS) Clinical Trials Network, Wiedemann HP, Wheeler AP, Bernard GR, Thompson BT, Hayden D, deBoisblanc B, Connors AF Jr, Hite RD, Harabin AL: Comparison of two fluid-management strategies in acute lung injury. N Engl J Med 2006, 354: 2564-2575. 10.1056/NEJMoa062200
36. Rivers EP: Fluid-management strategies in acute lung injury - liberal, conservative, or both? N Engl J Med 2006, 354: 2598-2600. 10.1056/NEJMe068105
37. Feinstein AR, Concato J: The quest for "power": contradictory hypotheses and inflated sample sizes. J Clin Epidemiol 1998, 51: 537-545. 10.1016/S0895-4356(98)00029-8
38. Halpern SD, Karlawish JH, Berlin JA: The continuing unethical conduct of underpowered clinical trials. JAMA 2002, 288: 358-362. 10.1001/jama.288.3.358
39. Altman DG, Bland JM: Absence of evidence is not evidence of absence. Aust Vet J 1996, 74: 311. 10.1111/j.1751-0813.1996.tb13786.x
40. Ioannidis JP, Evans SJ, Gøtzsche PC, O'Neill RT, Altman DG, Schulz K, Moher D, CONSORT Group: Better reporting of harms in randomized trials: an extension of the CONSORT statement. Ann Intern Med 2004, 141: 781-788.
41. Freemantle N, Calvert M, Wood J, Eastaugh J, Griffin C: Composite outcomes in randomized trials: greater precision but with greater uncertainty? JAMA 2003, 289: 2554-2559. 10.1001/jama.289.19.2554
42. Lim E, Brown A, Helmy A, Mussa S, Altman DG: Composite outcomes in cardiovascular research: a survey of randomized trials. Ann Intern Med 2008, 149: 612-617.
43. Ferreira-González I, Busse JW, Heels-Ansdell D, Montori VM, Akl EA, Bryant DM, Alonso-Coello P, Alonso J, Worster A, Upadhye S, Jaeschke R, Schünemann HJ, Permanyer-Miralda G, Pacheco-Huergo V, Domingo-Salvany A, Wu P, Mills EJ, Guyatt GH: Problems with use of composite end points in cardiovascular trials: systematic review of randomised controlled trials. BMJ 2007, 334: 786. 10.1136/bmj.39136.682083.AE
44. Dowdy DW, Eid MP, Sedrakyan A, Mendez-Tellez PA, Pronovost PJ, Herridge MS, Needham DM: Quality of life in adult survivors of critical illness: a systematic review of the literature. Intensive Care Med 2005, 31: 611-620. 10.1007/s00134-005-2592-6
Copyright information
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.