Introduction

Anterior cruciate ligament (ACL) rupture is a devastating injury that can lead to recurrent instability, chronic pain, and degenerative changes in the knee [58, 42]. Arthroscopic reconstruction is the standard approach, but controversy remains over the most favorable graft selection. The most commonly utilized autografts for ACL reconstruction in the United States are the bone-patellar tendon-bone (bone-tendon-bone) and the four-strand hamstring tendon (hamstring) [13, 43]. Some authors suggest that bone-tendon-bone autograft is the most favorable graft choice because of faster graft incorporation [33], a higher proportion of patients returning to preinjury activity levels [47], and potentially a lower risk of graft rupture [32]. However, others favor hamstring autografts because of lower rates of donor site morbidity, anterior knee pain, extensor strength deficit, and osteoarthritis [11, 27, 32, 39, 42].

Although controversy remains on specific advantages and disadvantages, clearly the most significant adverse outcome after ACL reconstruction is graft rupture and subsequent revision surgery. A number of prospective studies comparing bone-tendon-bone and hamstring have demonstrated a similar risk of rupture between the two graft types [2, 4, 21, 30, 38, 40, 41, 46]. A 2014 meta-analysis by Xie et al. [47] effectively summarized the literature before 2013 with roughly 1000 patients in each group and found no difference in the risk of graft rupture.

In subsequent years, multiple prospective studies have been published on this topic [3, 13, 34, 35]. The most notable has been data on more than 40,000 patients from Scandinavian ACL registries, in which researchers identified a higher risk of rupture/revision for patients receiving hamstring than for those receiving bone-tendon-bone [13]. This finding directly contradicts what was reported by Xie et al. in their 2014 meta-analysis [47] and thus raises new questions about failure rate after ACL reconstruction. It remains unclear whether there is an inherent difference in graft type that prior studies have been underpowered to detect or whether the difference found by Gifstad et al. [13] is a uniquely Scandinavian phenomenon that is not generalizable to the orthopaedic community at large.

The goal of this study was to perform an up-to-date meta-analysis, incorporating high-quality evidence from randomized controlled trials (RCTs), prospective comparative studies, and national registries in an attempt to address the aforementioned questions. Specifically, we asked which approach to ACL reconstruction (bone-tendon-bone or hamstring) has a higher risk of (1) graft rupture and/or (2) graft laxity?

Materials and Methods

This study considered for inclusion RCTs, prospective comparative studies, and large national registries with prospective data collection comparing bone-tendon-bone autograft and hamstring autograft in primary ACL reconstruction (Table 1).

Table 1 Summary of included studies

Inclusion criteria were minimum 2-year followup, documentation of graft failure rate (graft rupture or revision ACL reconstruction), and/or documentation of measures of graft laxity in the form of instrumented laxity (KT1000/2000™; MEDmetric Corp, San Diego, CA, USA), pivot shift test, or Lachman test among surviving grafts. Exclusion criteria for studies included case reports, single-center retrospective comparative studies, narrative reviews, or image reviews. Studies were not excluded based on patient demographics or surgical technique. When articles that reported on the same patient group were identified, the most recent data were utilized.

Search Strategy

Search engines utilized were PubMed, Cochrane Library, MEDLINE, and EMBASE. All articles that were not in English were excluded. Further studies were found utilizing the reference lists of the studies initially identified. Search terms included “ACL” or “anterior cruciate ligament” in combination with “hamstring autograft” or “semitendinosus” and “bone tendon bone autograft” or “patellar tendon”.

After duplicates were removed, 136 abstracts were screened. The titles and abstracts of all records retrieved by the search were assessed independently by two authors (BTS, NRJ). Any disagreements were resolved by discussion and arbitration by the three senior authors (KEW, TEH, AJK). After initial screening of these abstracts, two authors (BTS, NRJ) assessed the remaining full-text articles for inclusion in the study (Fig. 1).

Fig. 1

PRISMA diagram shows how the final studies included were obtained. Overall, there were 14 RCTs, 10 prospective cohort studies, and one national registry study.

Studies meeting inclusion criteria were assessed for quality by one of two methods. RCTs were assessed using the Jadad scale [23], whereas prospective comparative studies and registries were assessed using the Modified Coleman Methodology Scores (MCMS) (range, 0–100) [9]. Interrater reliability was assessed with the Pearson correlation coefficient. Values of correlation range from +1 (perfect positive correlation) to -1 (perfect negative correlation) with a value greater than +0.7 considered a strong positive correlation. This method has been utilized in the past to assess the validity of interrater reliability when using the MCMS in assessing the quality of articles reviewed [18]. The mean MCMS score of the included studies was 79 ± 5 and the median Jadad score for the RCTs included was 3 (range, 3–5) (Table 1). Blinded review of the selected articles was conducted to ensure interrater reliability, and a correlation coefficient of 0.79 was found for MCMS scores and 0.89 for Jadad scores. The majority of studies lost points from their MCMS scores as a result of incomplete followup and the study investigators being the surgeons who performed the operations.
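The interrater check described above can be illustrated with a minimal Python sketch. The two score lists are hypothetical and are not taken from the included studies; the function name `pearson_r` is likewise an assumption chosen for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each rater's mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a rater gives constant scores")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical MCMS scores from two reviewers for the same five studies
rater_a = [72, 80, 85, 78, 81]
rater_b = [70, 83, 84, 75, 82]
r = pearson_r(rater_a, rater_b)
# The 0.7 cutoff mirrors the threshold for a strong positive correlation
print(f"r = {r:.2f}, strong positive correlation: {r > 0.7}")
```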

Two authors (BTS, NRJ) independently extracted data from each included study. Any discrepancies were resolved by a third author (KEW). Patient demographics were extracted from each study as well as the number of patients in each study group (bone-tendon-bone versus hamstring). Specific outcomes measured were failure rates (defined as graft rupture or revision ACL reconstruction) in addition to instrumented laxity (positive defined as ≥ 3 mm), pivot shift (positive defined as 1+ or greater), and Lachman test (positive defined as 1+ or greater) in surviving grafts.

Outcomes Measured

There were 47,613 patients from 14 RCTs, 10 prospective comparative studies, and one national registry included in the meta-analysis (Table 1). Mean age for patients who underwent ACL reconstruction with bone-tendon-bone was 28 ± 3 years versus 28 ± 4 years for those who received hamstring autografts. Sixty-three percent of patients in the bone-tendon-bone cohort were men versus 57% of patients in the hamstring cohort. Average followup was 68 ± 55 months for the included studies.

Nineteen studies that included a total of 47,070 (7560 bone-tendon-bone, 39,510 hamstring) patients reported on graft rupture and/or revision ACL reconstruction. Seven studies, including 568 patients (280 bone-tendon-bone, 288 hamstring), reported Lachman test data. Fifteen studies reported data on instrumented laxity as defined by patients with a side-to-side difference of ≥ 3 mm at manual maximum testing with KT1000/2000™. This included 6216 patients (1433 bone-tendon-bone, 4783 hamstring). Overall, 13 studies reported on pivot shift testing and 6570 patients were included (1508 bone-tendon-bone, 5062 hamstring). Pivot shift was denoted as positive if patients had a grade of 1+ or greater on physical examination.

Statistical Analysis

Data analysis was performed with Review Manager (Version 5.3; The Cochrane Collaboration, London, UK). The odds ratio (OR) was used as the summary statistic for dichotomous variables. ORs were reported with 95% confidence intervals, and statistical significance was set at p < 0.05. Statistical heterogeneity between included studies was evaluated by the I² and chi-square tests with significance set at p < 0.10. The number needed to treat (NNT) was calculated utilizing observed relative risks and the patient’s expected event rate [12].
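As a rough illustration of how a pooled OR, its 95% confidence interval, and the I² statistic relate to each other, the following Python sketch pools 2x2 tables with inverse-variance fixed-effect weights. This is a simplified stand-in and not the Mantel-Haenszel weighting shown in the forest plots; the study counts and the function names are hypothetical.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Log OR and its standard error for one 2x2 table.

    a, b = events / non-events in group 1; c, d = events / non-events in group 2.
    """
    if min(a, b, c, d) == 0:
        # Continuity correction (add 0.5 to every cell) when a cell is zero
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

def pool_fixed(tables):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2."""
    effects = [log_odds_ratio(*t) for t in tables]
    weights = [1 / se ** 2 for _, se in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(tables) - 1
    # I^2 is the share of variability beyond chance, floored at 0
    i2 = 0.0 if df == 0 or q <= df else (q - df) / q * 100
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_pooled),
        "ci_high": math.exp(pooled + 1.96 * se_pooled),
        "q": q,
        "i2": i2,
    }

# Hypothetical studies: (BTB ruptures, BTB intact, hamstring ruptures, hamstring intact)
studies = [(3, 97, 5, 95), (2, 58, 4, 56), (6, 144, 7, 143)]
result = pool_fixed(studies)
print(result)
```

An OR below 1 in this layout would favor the first group (bone-tendon-bone), matching the direction used in the figures.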

Random-effects or fixed-effects models were used depending on the heterogeneity among studies. Random-effects modeling was utilized for I² values of > 25% [20]. A sensitivity analysis was performed by excluding one study in each round and evaluating the influence of any single study on the primary meta-analysis estimate. Subgroup analysis was also performed based on type of study design (RCT versus prospective cohort studies [PCS]) to identify potential differences between bone-tendon-bone and hamstring grafts across trials.
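The model choice and the leave-one-out sensitivity analysis can be sketched in Python as follows. This is an illustration under stated assumptions: the random-effects branch uses the DerSimonian-Laird estimator, which is one common choice and is not confirmed by the text above, and the effect sizes and function names are hypothetical.

```python
import math

def pooled_log_or(effects, i2_threshold=25.0):
    """Pool (log_or, se) pairs.

    Uses a fixed-effect model unless I^2 exceeds the threshold, in which
    case it switches to DerSimonian-Laird random effects (an assumption).
    """
    w = [1 / se ** 2 for _, se in effects]
    y = [e for e, _ in effects]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(effects) - 1
    i2 = 0.0 if df == 0 or q <= df else (q - df) / q * 100
    if i2 <= i2_threshold:
        return fixed, "fixed", i2
    # Between-study variance (tau^2), floored at 0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1 / (se ** 2 + tau2) for _, se in effects]
    random_est = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return random_est, "random", i2

def leave_one_out(effects, i2_threshold=25.0):
    """Re-pool after dropping each study in turn; returns (index, OR, model)."""
    results = []
    for i in range(len(effects)):
        subset = effects[:i] + effects[i + 1:]
        log_or, model, _ = pooled_log_or(subset, i2_threshold)
        results.append((i, math.exp(log_or), model))
    return results

# Hypothetical (log OR, standard error) pairs for four studies
effects = [(-0.20, 0.30), (-0.10, 0.25), (-0.90, 0.20), (-0.15, 0.35)]
for index, odds_ratio, model in leave_one_out(effects):
    print(f"without study {index}: OR = {odds_ratio:.2f} ({model} effects)")
```

If dropping a single study moves the pooled OR across 1 or changes the selected model, that study has a large influence on the estimate.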

Publication Bias

A funnel plot was created for each outcome measure to determine if there was bias in the studies. These plots showed no evidence of positive outcome bias in any of the outcomes measured and were relatively symmetric (Appendix 1 [Supplemental materials are available with the online version of CORR®.]).

Results

The meta-analysis found that patients undergoing primary ACL reconstruction with bone-tendon-bone autograft were less likely to experience graft rupture and/or revision ACL reconstruction than patients treated with hamstring autograft (OR, 0.83; 95% confidence interval [CI], 0.72–0.96; p = 0.01) (Fig. 2). An NNT analysis showed that 235 patients would need to be treated with a bone-tendon-bone graft rather than a hamstring graft to prevent one graft rupture.
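To make the NNT arithmetic concrete, here is a minimal Python sketch of the relationship between relative risk, the expected event rate, and the NNT. The input values are hypothetical and are not the event rates underlying the figure of 235; the function name is an assumption for illustration.

```python
import math

def nnt_from_relative_risk(relative_risk, expected_event_rate):
    """NNT = 1 / (expected event rate * (1 - relative risk)) for a protective effect."""
    if not 0 < expected_event_rate < 1:
        raise ValueError("expected_event_rate must be between 0 and 1")
    if not 0 < relative_risk < 1:
        raise ValueError("relative_risk must be between 0 and 1 for a benefit")
    # Absolute risk reduction: how much the event rate drops with treatment
    absolute_risk_reduction = expected_event_rate * (1 - relative_risk)
    return 1 / absolute_risk_reduction

# Hypothetical inputs: 10% expected rupture rate, relative risk of 0.5
nnt = nnt_from_relative_risk(relative_risk=0.5, expected_event_rate=0.10)
# NNT is conventionally rounded up; round first to remove floating-point noise
print(math.ceil(round(nnt, 6)))
```

Because the absolute risk reduction scales with the expected event rate, the same relative effect yields a large NNT when ruptures are uncommon.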

Fig. 2

Analysis of failure rate between the studies is demonstrated. OR for graft failure was 0.83 favoring bone-tendon-bone (95% CI, 0.72-0.96; p = 0.01). BPTB = bone-patellar-tendon bone; 4SHT = four-stranded hamstring tendon; M-H = Mantel-Haenszel.

Among patients who did not experience graft rupture or revision, there were no differences observed between the two graft types in any of our secondary analyses of graft laxity. No difference was found based on KT1000/2000™ testing between bone-tendon-bone and hamstring (OR, 0.86; 95% CI, 0.69–1.06; p = 0.16) (Fig. 3), pivot shift testing (OR, 0.89; 95% CI, 0.63–1.25; p = 0.51) (Fig. 4), or Lachman testing (OR, 0.96; 95% CI, 0.67–1.39; p = 0.84) (Fig. 5).

Fig. 3

Instrumented laxity was measured by KT1000/2000™. A positive test was defined as a ≥ 3-mm side-to-side difference. No major difference was observed between the two groups.

Fig. 4

Pivot shift analysis between the studies is shown. No major difference was observed between the two groups.

Fig. 5

Lachman testing analysis between the studies is shown. No major difference was observed between the two groups.

A subgroup analysis of the outcomes included in the study was performed according to the type of study design (PCS or RCT). The sole finding of difference was an OR of 0.76 favoring bone-tendon-bone over hamstring when instrumented laxity in RCTs alone was analyzed (95% CI, 0.62-0.92; p = 0.006); there were no differences in terms of Lachman or pivot shift testing (Table 2).

Table 2 Subgroup analysis

Discussion

Graft rupture is a feared complication after ACL reconstruction, because revision surgery often results in inferior patient-reported outcome measures, increased laxity with pivot shift testing, and increased rates of tibiofemoral arthritis [15]. Although bone-tendon-bone and hamstring autograft are common graft choices, the available evidence is mixed on which graft type is associated with a higher risk of graft rupture and revision ACL reconstruction. A recent large Scandinavian registry study [13] reported a higher risk of graft rupture with hamstring than bone-tendon-bone, but a prior meta-analysis [47] found no difference between graft types. The goals of this study were to utilize the statistical power of a meta-analysis to compare the risk of graft rupture/revision and metrics of graft laxity between graft types. We found a small increased risk of graft rupture/revision in the hamstring group when compared with the bone-tendon-bone group. The NNT was calculated at 235, meaning that 235 patients would need to be treated with bone-tendon-bone rather than hamstring to prevent one rupture. We observed few differences between graft types in terms of graft laxity reported by the primary source studies.

We interpret this study in light of several limitations. First, most of the patients included in this study came from the Scandinavian registry studies and those results factor heavily into the results of the current study. Although the methodology for a registry study is not as rigorous as that of an RCT, the data were collected longitudinally, and the authors have published on the validity of their registry [15]. The inclusion of these data was critical to this study as we attempted to determine whether the results of the registry studies would hold up in the setting of a larger meta-analysis. Second, this study was designed to investigate differences in the risk of graft rupture between the two graft types without considering clinical outcome scores and knee function. This fact keeps the study clear and focused but precludes its use in isolation when a clinician debates the merits of bone-tendon-bone versus hamstring autograft. We believe that the small increased risk of graft rupture observed in this meta-analysis should be only one aspect of a larger discussion with patients regarding optimal graft choice. Third, the nature of a meta-analysis is such that the authors can only analyze and interpret the data available in light of heterogeneous reporting of data across studies. This fact limits the number of patients available for analysis within the individual outcomes studied but does not alter the validity of the results. Lastly, the relatively short-term minimum followup in some of the included studies prevents us from evaluating the long-term outcomes between graft types.

In this study, patients who received bone-tendon-bone autograft had a reduced risk of graft failure compared with those receiving hamstring autograft at a minimum 2-year followup with an NNT of 235 patients. These data support the recent findings from the Scandinavian ACL registries [3, 13, 34, 35] and those of Maletis et al. in their retrospective registry review [29]. However, the observed difference in this study was small and clinicians must weigh the clinical importance of this finding while also considering differences in donor site morbidity, patient-reported outcome metrics, and knee function scores. All of these areas represent opportunities for continued research.

Secondary outcome measures of graft laxity included KT1000/2000™ instrumented laxity testing, pivot shift test, and the Lachman test. Instrumented laxity testing favored bone-tendon-bone when including RCTs in isolation; however, on examination of the entire cohort, none of these measures was found to be different between groups in the current study. This is in contrast to the meta-analysis by Xie et al. [47] that reported that pivot shift testing favored bone-tendon-bone and the Cochrane review by Mohtadi et al. [32] in which they found all three of these factors favored bone-tendon-bone. The absence of a difference in graft laxity between groups in this study indicates that previously observed differences might have been the result of sampling bias or recent improvements in surgical technique. Newer generation fixation devices for hamstring ACL reconstruction are designed to allow for more anatomic tunnel placement without compromising load to failure [10, 17]. The routine pretensioning of hamstring grafts also may contribute to less graft laxity regardless of graft type [24].

In this meta-analysis of short- to mid-term followup after primary ACL reconstruction, hamstring autografts failed at a higher rate than bone-tendon-bone autografts. However, failure rates were low in each group, the difference observed was small, and the grafts performed similarly in metrics of graft laxity. Both graft types remain viable options for primary ACL reconstruction, and the difference in failure rate should be one part of a larger conversation with each individual patient about graft selection that should also include potential differences in donor site morbidity, complication rates, and patient-reported outcome measures. Continued prospective collection of patient data will be important going forward as we attempt to further characterize the potential differences in outcomes attributable to graft selection.