Randomization is not a scientific method; it is an invaluable statistical strategy for the mathematical exploitation of uncertainty

A.R. Feinstein Ann Intern Med 1994;120:799-805.

Introduction

The recent statement of the Fondazione Umberto Veronesi ethics committee concerning the role of randomization in clinical research trials is one of the many voices in a wider debate on the need for change in the domain of randomized clinical trials [1]. Since 1948, when Austin Bradford Hill et al. conducted the first randomized trial, testing the effectiveness of streptomycin in pulmonary tuberculosis against standard treatment [2], randomized clinical trials (RCTs) have been considered the most reliable method for comparing the efficacy of different healthcare interventions. However, there are major concerns about the overabundant obstacles to the conduct of these studies, particularly the increasingly onerous regulation and related bureaucracy that make them more and more difficult to carry out [3, 4]. Moreover, from a purely clinical point of view, the major problem is translating the RCT evidence produced “on averages” into clinical decisions regarding individual patients [5–7]. The paper from the Fondazione Umberto Veronesi focuses on the inherent ethical issues, mainly randomization, raising many questions but also proposing possible solutions.

The first question is a basic one: the ethicality of using patients as tools for reaching a scientific goal. Patients enrolled in an RCT are used to improve medical knowledge, which may be useful for other patients, but they cannot themselves be the users of the results of the trial in which they are participating. The authors emphasize that patients assigned to the control arm certainly cannot benefit from the new treatment, even if it were to be proven effective. They are actually “discriminated against,” given that the study design prevents patients in the control arm from receiving a potentially innovative treatment.

The second question concerns the possible harm of incomplete communication: because of the blinding usually associated with randomization, patients are not told which treatment they actually receive. Their consent to that treatment, given in blindness, has an intrinsically lower value.

The third concern is the long time needed to complete a classically designed RCT, with the consequent delay in making available to clinical practice any treatment that might be more effective than standard care.

Even if the authors claim to acknowledge the scientific validity of, as well as the necessity for, randomization, they place far more emphasis on the ethical conflicts and on the consequent need for incisive solutions, calling for a wide discussion. Their suggestions are:

  • A wider use of clinical and administrative databases. Databases should be able to store more and more easily manageable data, faster and faster. On this basis, they suggest that clinical (observational) studies should become quicker and simpler, allowing the planning of more focused and ethically correct studies.

  • Termination of every RCT as early as possible. An RCT should always be planned to conclude with the minimum number of patients and in the minimum lapse of time, and should also allow patients in the control arm fast access to the experimental treatment if it proves to be effective.

The authors conclude that methodological research into innovative approaches must be encouraged, and that the use of randomization should be reduced. Even if the discussion makes some attempts to acknowledge different opinions, and some contentious assertions are mitigated, the whole paper seems biased by an excess of optimism about, and reliance on, the efficacy of new treatments.

We will discuss this statement, and in particular, challenge the points concerning the ethical problem of randomization.

The equipoise principle: genuine uncertainty within the expert medical community about the preferred treatment

According to the scientific literature, only a minority (less than 15 %) of experimental treatments evaluated in a clinical context prove to be more effective than standard care [8]. Conversely, the list of drugs that looked promising and were subsequently proved to be detrimental is very long, and the coxib story is only a recent example [9]. Before being introduced into clinical practice, a new treatment deserves very careful scrutiny, and a long, hard journey from bench to bedside has to be made, because it is essential for patient safety. No drug, intervention, or diagnostic test should be introduced into practice before showing actual effectiveness on relevant clinical outcomes [10]. This requires adequate studies, large samples, and well-conducted RCTs. The effectiveness of an innovative treatment must never be considered self-proven: being new is not enough to be effective. There are no good scientific reasons for claiming that a “promising” treatment is more effective than the standard one without appropriate evidence from clinical studies. Moreover, it is even harder to understand why anyone would embark on an RCT if the efficacy of the new treatment were already proven and “glaringly obvious.” Any such “demonstrative” study has to be considered unethical. Hence, enrollment into an RCT control arm would be detrimental for patients only if the new treatment had already been demonstrated to be more effective than the comparator, and such a study would be unethical not because of randomization, but because of futility. The scientific and ethical reason for embarking on a clinical trial is the actual absence of evidence (clinical equipoise). Randomization, allocation concealment and blinding are the main parameters used in evaluating the quality of an RCT (internal validity); they allow clinicians to obtain sound, unbiased evidence on effects that cannot be known in advance.

From the patient’s point of view, this concept may not be easy to understand, and the misunderstanding could affect the patient–physician relationship. Uncertainty is intrinsic to the practice of medicine, even though patients receive many reassuring messages about the accuracy of medical diagnoses and the efficacy of treatments. Learning to cope with uncertainty is critical for clinical reasoning, as is discussing uncertainty empathically with patients to prevent any impairment of their confidence and trust. Administering a treatment according to chance, i.e., according to a randomization table, may sound very far from both the patient’s and the physician’s expectations, and may even seem illogical and unethical; in the face of actual uncertainty, however, it can improve medical knowledge while minimizing possible harm for patients.

At the opposite extreme from the detractors of RCTs are some enthusiastic supporters who claim that all interventions need to be validated by a randomized controlled trial. In certain cases, the most often quoted being the parachute example [11], an RCT designed to assess the efficacy of an intervention may actually be absurd, and consequently unethical. As was suggested in the putative parachute randomized trial, “Individuals who insist that all interventions need to be validated by a randomized controlled trial need to come down to earth with a bump” [11]. Nevertheless, interventions as effective as a parachute in preventing “death and trauma related to gravitational challenge” are not abundant, and are probably fewer than the authors of the Fondazione Umberto Veronesi ethics committee statement assume [8].

Uncertainty, as the basis for both the ethical and the scientific justification of RCTs, is clearly discussed in a recent study in the oncology field. In this careful analysis of the results of RCTs conducted by the National Cancer Institute cooperative groups since their inception in 1955, the pooled estimates of the hazard ratios for overall and event-free survival were 0.95 (99 % CI 0.93–0.98) and 0.90 (99 % CI 0.87–0.93), respectively, only slightly favoring new treatments [12]. These results are consistent with the hypothesis that the ethical principle of equipoise defines the limits of discoverability in clinical research, and ultimately drives therapeutic advances in clinical medicine. When planning and conducting these studies, the physicians (as well as the patients) were uncertain about the actual effects of the new treatment versus the standard treatment with which it was to be compared. The expected probability that the new treatment will be successful describes the limits within which a study is acceptable from both the ethical and the scientific point of view. Most people are willing to be enrolled in an RCT if the probability of success of the experimental treatment is between 50 and 70 % [13], and most institutional review board members would only approve trials with an expected probability of success between 40 and 60 % [14]. These levels of expected probability of success reflect genuine uncertainty (the equipoise principle), and minimize possible harm for patients in both arms.

The possible harms of randomization

When the equipoise principle is respected, randomization allows one to obtain sound, unbiased comparisons between treatments. The harms and benefits of being enrolled in an RCT should be balanced, and it is both scientific and ethical to minimize the use of placebo, in favor of the best standard care, in the control arm. A recent systematic review including more than 140,000 patients suggests that participation in RCTs is associated with outcomes similar to those of receiving the same treatment outside RCTs [15]. Furthermore, some studies even support the hypothesis that hospitals that participate in RCTs may offer better outcomes than hospitals that do not [16]. Therefore, it can be argued that RCTs are not so dangerous, and that no specific protection from randomization is needed whenever the uncertainty principle is respected.

Big data and clinical research

A wider use of clinical databases, although very useful in phase IV post-marketing studies, cannot replace RCTs: non-randomized comparisons cannot guarantee that different outcomes depend on a single variable, i.e., the different treatments. High-quality registries, which collect standardized data from patients seen in a variety of settings, are nowadays available for assessing comparative effectiveness; however, despite statistical advances, observational registry studies are still considered suspect precisely because they lack the rigor of randomization [17, 18]. Confounding by indication (the “by indication bias”) is really difficult to eliminate without randomization, even when using the “propensity score,” a statistical approach that adjusts comparisons by minimizing the effect of known possible confounders.
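The problem of confounding by indication can be illustrated with a small simulation. The sketch below is not taken from any of the cited studies: the `severity` variable, the risk model, and all numbers are hypothetical assumptions chosen only for illustration. A treatment with no true effect is given preferentially to sicker patients in an "observational" cohort, and by coin toss in a "randomized" one.

```python
import random


def simulate(n=20000, true_effect=0.0, seed=0):
    """Return (observational, randomized) risk differences, treated minus control.

    Everything here is hypothetical: 'severity' is an invented confounder and
    the risk model is chosen only to illustrate confounding by indication.
    """
    rng = random.Random(seed)
    obs = {0: [], 1: []}  # outcomes by arm when the clinician chooses
    rct = {0: [], 1: []}  # outcomes by arm when chance chooses

    for _ in range(n):
        severity = rng.random()  # 0 = mild disease, 1 = severe disease
        # Observational cohort: sicker patients are more likely to be treated.
        t_obs = 1 if rng.random() < severity else 0
        # Randomized cohort: allocation ignores severity.
        t_rct = 1 if rng.random() < 0.5 else 0
        for treated, store in ((t_obs, obs), (t_rct, rct)):
            # Event risk rises with severity; the treatment adds true_effect.
            p_event = 0.1 + 0.4 * severity + true_effect * treated
            store[treated].append(1 if rng.random() < p_event else 0)

    def risk_difference(store):
        return sum(store[1]) / len(store[1]) - sum(store[0]) / len(store[0])

    return risk_difference(obs), risk_difference(rct)


if __name__ == "__main__":
    obs_rd, rct_rd = simulate()
    print(f"observational risk difference: {obs_rd:+.3f}")
    print(f"randomized risk difference:    {rct_rd:+.3f}")
```

Under these assumptions the inert treatment looks harmful in the observational comparison (an expected risk difference of roughly 0.13, because treated patients are on average sicker), whereas the randomized comparison stays close to zero. A propensity score could remove this bias here only because severity is fully observed; an unmeasured confounder would leave it in place.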

Registry-based randomized trials can offer a solution: investigators are quickly able to identify potential participants by gathering clinical information from preexisting databases, and can enroll thousands of patients in a short time, avoiding long case-report forms and obtaining accurate follow-up with minimal effort [19]. An example of such a registry-based trial is the recent Thrombus Aspiration in ST-elevation myocardial infarction in Scandinavia (TASTE) trial, a large-scale study designed to answer an important clinical question and carried out at remarkably low cost by building on the platform of an already-existing high-quality observational registry [20]. The trial is still a trial: a rigorous randomized experiment that isolates a causal link (or the absence of one) between a treatment and an outcome. The investigators were able to enroll a large number of patients, thus offering clinicians insights that are potentially based on a representative sample, a real-world population created from consecutively enrolled registry patients. This highly efficient design raises some questions that are at once ethical and scientific: what are the best populations or sub-populations to study to ensure representativeness and a balance between efficacy and effectiveness? How should concerns about privacy and informed consent be approached? How can sufficient database quality be ensured, by reducing missing data and obtaining long-term follow-up? [19]

Furthermore, the problem is not only to hasten studies, but to put them into context: each study should clearly define the pre-existing evidence in order to plan an adequate design and to build a continuous improvement of knowledge, not just a simple sum of data [21]. Hence, one could hypothesize that the very availability of large databases, and the real prospect of conducting large studies, would remove some obstacles and promote the use of RCTs to solve more and more clinical questions.

The benefit of early stopping RCTs

Finally, the generalized rule that stopping RCTs early is beneficial is questionable. A systematic review shows that trials stopped early demonstrate implausibly large treatment effects, particularly when the number of events is small [22]. Because the effect size is overestimated and the rate of adverse events is underestimated, a trial stopped early cannot support a balanced decision, and ends up being useless and consequently unethical. The definition and planning of stopping rules must prevent patient harm, and also preserve the validity of the results. Turning to the process of approval of investigational drugs, clarifying the benefit–risk relationship is a complex pathway: evidence for both efficacy and safety is needed. Too many attempts to shorten this pathway have produced more harm than benefit. According to the US experience, fast-track or accelerated pathways seem only to have reduced the quantity, and altered the nature, of the evidence required [23]. For example, cancer drugs approved during the previous decade on the basis of limited clinical trials (nonrandomized, unblinded, single-group, phase 1 and phase 2 trials that used intermediate end points rather than patient survival) had 72 % greater odds of serious adverse events occurring in their pivotal trials than did cancer drugs approved on the basis of more rigorous studies [24]. Such expedited approvals, with deferral of more rigorous studies, undermine and delay evaluation of the benefit–risk profile of an investigational drug. Once a drug is approved, enrolling patients in clinical trials to determine efficacy is more challenging than before approval, because patients have the choice of receiving the drug in the normal course of therapy versus enrolling in a trial in which they may be randomly assigned to usual care [23]. Moreover, it seems difficult for patients to make an informed and balanced choice about “breakthrough drugs” approved with new clinical trial techniques rather than with traditional randomized trials, especially considering that it is precisely the patient with a life-threatening illness who seems to underestimate the risk associated with therapy [25].
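Why trials stopped early tend to overestimate the treatment effect can be seen in a toy simulation. The sketch below is only an illustration under stated assumptions: the event rates, the interim looks, and the stopping rule (stop for benefit whenever the interim z statistic exceeds a fixed threshold) are hypothetical and much cruder than the group-sequential boundaries used in real trials.

```python
import math
import random

# Hypothetical event rates: the true absolute risk reduction is 0.05.
P_CONTROL = 0.30
P_TREAT = 0.25


def run_trial(rng, n_per_arm=1000, looks=(200, 400, 600, 800), z_stop=2.5):
    """Simulate one two-arm trial with interim looks.

    Returns (estimated risk reduction, stopped_early). The stopping rule is a
    deliberately simplified, hypothetical one: stop when z > z_stop at a look.
    """
    events_c = events_t = 0
    for i in range(1, n_per_arm + 1):
        events_c += 1 if rng.random() < P_CONTROL else 0
        events_t += 1 if rng.random() < P_TREAT else 0
        if i in looks:
            diff = (events_c - events_t) / i
            pooled = (events_c + events_t) / (2 * i)
            se = math.sqrt(2 * pooled * (1 - pooled) / i)
            if se > 0 and diff / se > z_stop:
                return diff, True  # stopped early "for benefit"
    return (events_c - events_t) / n_per_arm, False


def simulate_trials(n_trials=500, seed=1):
    """Run many trials and average the estimates of those stopped early."""
    rng = random.Random(seed)
    stopped = []
    for _ in range(n_trials):
        estimate, early = run_trial(rng)
        if early:
            stopped.append(estimate)
    mean_stopped = sum(stopped) / len(stopped) if stopped else float("nan")
    return {"n_stopped": len(stopped), "mean_stopped": mean_stopped}


if __name__ == "__main__":
    result = simulate_trials()
    print(f"true risk reduction:        {P_CONTROL - P_TREAT:.3f}")
    print(f"trials stopped early:       {result['n_stopped']}")
    print(f"mean estimate when stopped: {result['mean_stopped']:.3f}")
```

Under these assumed parameters a trial stops early only when its interim estimate happens to exceed the threshold, so the average estimate among stopped trials lies above the true risk reduction of 0.05, and the earlier the stop (the fewer the events), the larger the exaggeration.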

Research priorities

Biostatistics research might well improve study design in answering clinical questions, but discarding randomization does not seem to be the right priority. Reducing the number of inconclusive trials, estimated to be 30 % of all studies [12], might be even more relevant. Such a high level of inefficiency seems to be related to researchers’ overly optimistic assessment of the benefit of treatment rather than to difficulties in the accrual of patients. More realistic estimates of RCT sample size are needed, entailing a clearer definition of the alternative hypothesis: what do we mean by a “relevant effect size,” and why do we want to protect against the “type 1 error” more than against the “type 2 error”? The sample size debate should also take into account the frequent use of meta-analyses [26, 27]. Finally, applying the summary results of RCTs to individual patients, who are usually not well represented within the “study population” (exclusion by comorbidity), is of paramount relevance from the methodological, clinical and ethical points of view: therapy should be personalized by including in the data analysis personal risk modeling with appropriate covariates. The aim is not only to adjust comparisons but also to decide on individual patients [7]. In keeping with the ethical need to consider patients as the aim of research, a clear definition of patient-centered outcomes should be developed; that is, clinical research studies should face problems and questions from the perspective of patients and their values [28]. Formal statistical significance does not by itself mean clinical relevance, which depends only on patients’ actual benefits. Large trials can show differences between the two arms that are statistically significant but not clinically relevant [29]. Even well-conducted RCTs do not always properly take into account patients’ needs and perspectives: clinical outcomes are not chosen by patients, and may be relevant only for clinicians [30, 31]. Even clinical outcomes such as death or hospitalization are in some sense “intermediate” outcomes until weighted by patients’ preferences.
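The link between an over-optimistic expected benefit and an inconclusive trial can be made concrete with the usual normal-approximation formula for comparing two proportions. The sketch below is illustrative only: the function name and the event rates are hypothetical, and the formula is the standard textbook approximation rather than a method taken from the cited references.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treat, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided comparison of proportions.

    Normal-approximation formula:
        n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    alpha is the type 1 error rate; 1 - power is the type 2 error rate.
    """
    if p_control == p_treat:
        raise ValueError("the two event rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    delta = p_control - p_treat
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


if __name__ == "__main__":
    # Hypothetical scenario: control event rate 30 %.
    optimistic = sample_size_per_arm(0.30, 0.20)  # hoped-for effect
    realistic = sample_size_per_arm(0.30, 0.25)   # more modest effect
    print(f"patients per arm, optimistic effect: {optimistic}")
    print(f"patients per arm, realistic effect:  {realistic}")
```

With these assumed rates, halving the expected absolute benefit from 10 to 5 percentage points raises the requirement from about 290 to about 1,250 patients per arm, so a trial sized on the optimistic assumption would be badly underpowered for the realistic one. The defaults (alpha 0.05, power 0.80) also encode the conventional choice of guarding more strictly against the type 1 than the type 2 error.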

From the ethical point of view, it seems particularly relevant to pay attention to the sponsorship of studies, conflict of interest disclosure, authorship and accountability of reporting, and completeness of the reporting of results, even in the case of “negative” studies [32, 33]. Sponsorship of drug and device studies by the manufacturing company leads to more favorable results and conclusions than sponsorship by other sources [34]. Even the conclusions of trials funded by for-profit organizations may be more positive because of biased interpretation of trial results [35].

Conclusions

In conclusion, the assumption and concern in the statement of the Fondazione Umberto Veronesi ethics committee that randomization creates a conflict between the “scientific” and “ethical” aspects of a clinical trial should at least be mitigated, considering that only scientifically sound studies can be considered ethical. Randomization remains a scientific and ethical approach to ensuring the internal validity of a study, provided that the equipoise principle is respected. From the ethical point of view, aspects other than randomization seem to be more important when considering RCTs and their publication [3].