Key Summary Points

Why carry out this study?

Anogenital warts (AGWs) are one of the most common sexually transmitted diseases, with an overall prevalence rate of around 1–5%.

No clinically meaningful hierarchy of first-line treatments for anogenital warts is provided in international guidelines.

What was learned from the study?

Based on a low level of evidence, surgery and electrosurgery achieved the best complete lesion response after clearance and recurrence assessment.

Podophyllotoxin 0.5% was the most efficacious patient-administered treatment.

Introduction

Anogenital warts (AGWs) are benign epithelial skin lesions that are predominantly caused by human papillomavirus (HPV) types 6 and 11, but are sometimes associated with oncogenic HPV types [1]. With an overall prevalence rate of around 1–5%, they are one of the most common sexually transmitted infections [2]. AGWs are usually asymptomatic, but they can be painful or pruritic and can cause significant psychosocial distress depending on size and location [3, 4]. Numerous HPV vaccination campaigns have been conducted, but few studies have demonstrated their efficacy in reducing the number of AGWs [5]. Moreover, in most countries, vaccination coverage is partial and has yet to be extended to men [6, 7].

Many treatments are available to treat AGWs. These can be divided into provider-administered treatments (ProTs) [trichloroacetic acid (TCA), podophyllin resin, CO2 laser surgery, cryotherapy, surgical excision, electrosurgery, intralesional therapy, etc.] and patient-administered treatments (PaTs) [podophyllotoxin, imiquimod, sinecatechins, 5-fluorouracil (5-FU) cream, etc.]. The latest guidelines [8,9,10,11] recommend that treatment of AGWs be adapted to: size, number, and anatomic site of AGWs; patient preference; convenience; adverse effects; cost of treatment; and physician experience. These recommendations, however, are based on head-to-head randomized trials or on expert advice. Furthermore, RCTs comparing several major treatments for AGWs (cryotherapy [12] vs podophyllotoxin cream or gel, imiquimod vs TCA, CO2 laser vs surgery or electrosurgery, etc.) are lacking [10, 13,14,15] and may never be performed (because they are costly, time-consuming, less attractive than new treatments, etc.). Reliable evidence on the comparative efficacy of these treatments is nevertheless needed to make informed clinical decisions.

In this context, network meta-analyses (NMAs) can help compare the relative benefits associated with different types of intervention for the same disease [16, 17]. The only NMA on AGWs, which was conducted by Barton et al. [13] based on a systematic review up to March 2018, concluded that ablative techniques were superior; it also found podophyllotoxin 0.5% solution to be the most cost-effective topical treatment. However, this NMA did not examine sinecatechins, 5-FU cream, and several RCTs on new treatments [citric acid, intralesional bleomycin, potassium hydroxide (KOH), photodynamic therapy (PDT), etc.].

Our NMA aims to establish a clinically meaningful hierarchy of PaTs and ProTs for the management of AGWs.

Methods

The study protocol is registered with PROSPERO (No. CRD42015025827). The systematic review, which was published earlier [14], adheres to the PRISMA Statement [18]. The present study adheres to the PRISMA extension for NMA [19]. This article is based on previously conducted studies and does not contain any studies with human participants or animals performed by any of the authors.

Systematic Review

Twelve electronic databases were systematically searched from inception to August 2018 by 2 independent reviewers (A.B. and C.D.). Search terms included 2 synonym groups, AGW and RCT, with adjustments for each database (Appendix S1). The reference lists of all published studies and all recent reviews and meta-analyses were also searched [8,9,10, 13, 20,21,22]. No language restriction was imposed. To be included in the NMA, RCTs had to: (1) have at least 1 treatment group composed of immunocompetent adults clinically diagnosed with AGWs and treated with a ProT (TCA, podophyllin, CO2 laser, cryotherapy, surgical excision, electrosurgery, all intralesional treatments, KOH, PDT, citric acid) or a PaT [podophyllotoxin, imiquimod, sinecatechins, 5-FU, cidofovir, interferon (INF) cream]; and (2) provide original estimates with risk ratios and confidence intervals (CIs) or present sufficient data to allow calculation of these estimates. Complete lesion response (CLR) at the end of follow-up was assessed based on two outcome measures: clearance at 3 months and recurrence 3 months later.

An extraction grid was developed after collegial discussion. For all selected studies, variables of interest were extracted independently by 2 reviewers (A.B. and C.D.). These reviewers assessed the risk of bias in the selected RCTs using the Cochrane Collaboration Risk of Bias tool [23]. When different RCTs involved the same patient cohort, the RCT with the longest follow-up period was considered.

Data Synthesis

An NMA was performed that combined the results of all selected comparisons of AGW treatments. This statistical technique is used to account for direct comparisons performed in single trials and to make indirect comparisons across trials based on a common comparator intervention [24]. In our NMA, placebo and podophyllin served as comparators for indirect comparisons even though they are not used in clinical practice. For RCTs comparing treatments at lower or higher dosages than recommended in published guidelines, only recommended dosages were considered. All analyses were performed with a frequentist approach using a random effects model, with an equal heterogeneity variance assumed for all comparisons.
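The indirect-comparison step described above can be illustrated with a minimal sketch of an adjusted indirect comparison through a common comparator. This is a simpler building block than the frequentist random effects NMA used in this study, not a reproduction of it. The function name and all numeric inputs are hypothetical and serve only as an illustration.

```python
import math

def indirect_rr(rr_a_vs_c, se_log_a_vs_c, rr_b_vs_c, se_log_b_vs_c, z=1.96):
    """Indirect relative risk of A vs B through a common comparator C.

    On the log scale the indirect estimate is the difference of the two
    direct estimates, and their variances add.
    """
    log_rr = math.log(rr_a_vs_c) - math.log(rr_b_vs_c)
    se = math.sqrt(se_log_a_vs_c ** 2 + se_log_b_vs_c ** 2)
    lower = math.exp(log_rr - z * se)
    upper = math.exp(log_rr + z * se)
    return math.exp(log_rr), lower, upper

# Hypothetical inputs: A vs placebo RR 6.0 (SE of log RR 0.30),
# B vs placebo RR 3.0 (SE of log RR 0.40).
rr_ab, lower_ab, upper_ab = indirect_rr(6.0, 0.30, 3.0, 0.40)
print(f"Indirect RR A vs B: {rr_ab:.2f} (95% CI {lower_ab:.2f} to {upper_ab:.2f})")
```

Because the variances add, an indirect estimate is always less precise than either of the direct estimates it is built from, which is one reason the CIs in a sparse network tend to be wide.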

The network geometry was assessed by graphically examining the connections between interventions [17]. Each node represented an intervention. The thickness of nodes was proportional to the number of allocated patients. The thickness of connecting lines was inversely proportional to the variance between 2 interventions.

The Netmeta R package version 8.0 (available at: https://CRAN.R-project.org/package=netmeta) was used to perform head-to-head comparisons of different treatments to a placebo [25]. Specifically, 2 forest plots using random effects models were generated by calculating point estimates of relative risk (RR) with a 95% CI. A net heat plot (which is a type of matrix visualization) was created with the Netmeta R package to evaluate heterogeneity and inconsistency [26]. Warmer colors indicated significant inconsistency.
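As an illustration of how a relative risk and its 95% CI can be obtained for a single two-arm trial, the following minimal sketch uses the usual log-scale approximation. It does not reproduce the Netmeta computations, and the function name and patient counts are hypothetical.

```python
import math

def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Relative risk with a log-scale confidence interval for one trial."""
    if min(events_trt, events_ctl) <= 0:
        raise ValueError("both arms need at least one event for this formula")
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial arms.
    se = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Hypothetical trial: 30/50 patients cleared on treatment, 10/50 on placebo.
rr, lower, upper = relative_risk(30, 50, 10, 50)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 corresponds to a difference from the comparator that is significant at the 5% level.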

The patient was the unit of analysis for all RCTs. The endpoint—CLR after clearance and recurrence assessment—was evaluated using per protocol analysis (cured patients/follow-up patients). Sensitivity analyses of 2 scenarios were performed: (1) a worst-case (intention to treat) scenario, in which patients lost to follow-up were considered to be failing treatment (cured patients/all included patients); and (2) a best-case scenario, in which patients lost to follow-up were considered cured [(cured patients + lost to follow-up patients)/all included patients].
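The three ways of computing CLR described above can be written out as a short sketch. The function name and the patient counts are hypothetical; only the three ratios come from the text.

```python
def clr_scenarios(cured, followed_up, included):
    """Return the CLR proportion under the three analysis scenarios."""
    if not (0 <= cured <= followed_up <= included) or followed_up == 0:
        raise ValueError("expected 0 <= cured <= followed_up <= included, followed_up > 0")
    lost = included - followed_up
    return {
        # Main analysis: cured patients / follow-up patients.
        "per_protocol": cured / followed_up,
        # Worst case: patients lost to follow-up count as treatment failures.
        "worst_case": cured / included,
        # Best case: patients lost to follow-up count as cured.
        "best_case": (cured + lost) / included,
    }

# Hypothetical arm: 50 patients included, 40 followed up, 30 cured.
scenarios = clr_scenarios(cured=30, followed_up=40, included=50)
for name, value in scenarios.items():
    print(f"{name}: {value:.2f}")
```

The worst-case and best-case values always bracket the per protocol value, so the gap between them shows how sensitive an arm is to attrition.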

The probability that each intervention achieved CLR was estimated based on the relative effect sizes estimated with the NMA. A hierarchy of compared interventions was established using the Surface Under the Cumulative Ranking curve (SUCRA). SUCRA values are expressed as percentages and show the relative probability of an intervention being among the best options.
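For readers unfamiliar with SUCRA, the following minimal sketch shows how a SUCRA value can be derived from a vector of rank probabilities for one treatment. The function name and the example probabilities are hypothetical, and how the rank probabilities themselves were obtained in this study is not shown here.

```python
def sucra(rank_probs):
    """SUCRA for one treatment.

    rank_probs[k] is the probability that the treatment has rank k + 1,
    where rank 1 is the best. SUCRA is the mean of the cumulative rank
    probabilities over the first (number of treatments - 1) ranks.
    """
    n = len(rank_probs)
    if n < 2:
        raise ValueError("at least two ranks are required")
    if abs(sum(rank_probs) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    area = 0.0
    for p in rank_probs[:-1]:
        cumulative += p
        area += cumulative
    return area / (n - 1)

# Hypothetical three-treatment network.
always_best = sucra([1.0, 0.0, 0.0])
always_worst = sucra([0.0, 0.0, 1.0])
mixed = sucra([0.5, 0.3, 0.2])
print(f"{always_best:.0%}, {always_worst:.0%}, {mixed:.0%}")
```

A treatment that is certain to rank first has a SUCRA of 100%, and one that is certain to rank last has a SUCRA of 0%.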

Results

Characteristics of Selected Trials

Seventy RCTs involving 9931 patients with a mean of 142 participants per study fulfilled the inclusion criteria [14] (Appendix S2–S3). The overwhelming majority of included RCTs (66/70) were found to be of poor quality (Appendix S4) [14]. Twenty-one RCTs were excluded from the NMA: 6 because they compared dosages that were lower than recommended [27,28,29,30,31,32], 14 because they did not evaluate recurrence [33,34,35,36,37,38,39,40,41,42,43,44,45,46], and 1 because it was disconnected from the network (intralesional bleomycin vs podophyllin + cryotherapy) [47].

Nine studies comparing a recommended dosage with a lower dosage were included, but without the treatment arm that received the lower dosage [48,49,50,51,52,53,54,55,56]. Ultimately, 29 treatments or combination therapies were included. One RCT compared 4 arms [57], 5 RCTs compared 3 arms [58,59,60,61,62], and 43 compared 2 arms [48,49,50,51,52,53,54,55,56, 63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96]. Following these inclusion criteria, two of the 4 RCTs with a low risk of bias were excluded [27, 45]. The median follow-up for the 6006 covered patients was 6 months (3–12 month range).

Network Geometry

The complex network generated from the 49 included RCTs is shown in Fig. 1. Compared treatments were connected either directly or indirectly through 1 or more “comparators.” The level of evidence informing each comparison was evaluated. Treatment comparisons involving the largest number of patients were polyphenon vs placebo (3 trials; 767 patients receiving treatment) and podophyllin vs podophyllotoxin gel (6 trials; 1005 patients receiving treatment). Only 12 RCTs [50, 57, 58, 62, 77, 78, 81, 83, 84, 87, 90, 96] directly compared a ProT to a PaT; of these, 9 examined treatments that are not used in clinical practice (6 on podophyllin and 3 on intralesional therapies). The most commonly studied agents were placebo (18 trials; 939 patients receiving treatment) and podophyllin (13 trials; 716 patients receiving treatment).

Fig. 1

Evidence network of eligible comparisons for complete lesion response in network meta-analysis. The thickness of connecting lines represents the cumulative number of trials for each comparison, and the thickness of nodes is proportional to the number of enrolled participants. Cryo cryotherapy, ablative ablative treatment (surgery or electrosurgery or CO2 laser or cryotherapy), imi imiquimod 5%, 5-FU 5 fluorouracil, 5-FU intra intralesional 5 fluorouracil, TCA trichloroacetic acid, podo podophyllin 20–25%, citric ac citric acid 9%, polyph polyphenon 15%, podotox cr podophyllotoxin 0.5% cream, podotox cr/gel podophyllotoxin 0.5% gel + cream, podotox gel podophyllotoxin 0.5% gel, PDT photodynamic therapy, mycobac intra intralesional Mycobacterium, KOH potassium hydroxide, electro electrosurgery, INF-1a intra intralesional interferon-1α, INF-2b intra intralesional interferon-2β

Fig. 2

Forest plot of the estimates of relative risk between each treatment and the reference placebo for complete lesion response. Data presented as RR (95% CI). Cryo cryotherapy, ablative ablative treatment (surgery or electrosurgery or CO2 laser or cryotherapy), imi imiquimod 5%, 5-FU 5 fluorouracil, 5-FU intra intralesional 5 fluorouracil, TCA trichloroacetic acid, podo podophyllin 20–25%, citric ac citric acid 9%, polyph polyphenon 15%, podotox cr podophyllotoxin 0.5% cream, podotox cr/gel podophyllotoxin 0.5% gel + cream, podotox gel podophyllotoxin 0.5% gel, PDT photodynamic therapy, mycobac intra intralesional Mycobacterium, KOH potassium hydroxide, electro electrosurgery, INF-1a intra intralesional interferon-1α, INF-2b intra intralesional interferon-2β

Complete Lesion Response

Figure 2 presents the CLR of all treatments and placebos compared using a random effects model. Most CIs were wide, but rarely included the value 1. Cidofovir, citric acid 9%, intralesional INF, intralesional placebo, and polyphenon 15% achieved a CLR not significantly different from placebo. Surgery (RR 10.54; CI 95% 4.53–24.52), ablative therapy + imiquimod (RR 7.52; CI 95% 3.63–15.57), and electrosurgery (RR 7.10; CI 95% 3.47–14.53) achieved the best CLR compared to placebo. Other comparisons to placebo had RRs that ranged from 3.84 to 6.75.
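The width of these intervals can be made concrete with a short sketch. Assuming the CIs were built on the log scale as log RR ± 1.96 × SE (an assumption, although it is the usual convention), the point estimate should sit at the geometric midpoint of the bounds, and the standard error of log RR can be recovered from them. The helper names are hypothetical; the numbers are those reported above for surgery vs placebo.

```python
import math

def se_log_rr_from_ci(lower, upper, z=1.96):
    """Recover the standard error of log(RR) from a log-scale 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def geometric_midpoint(lower, upper):
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(lower * upper)

# Reported values for surgery vs placebo.
surgery_rr, surgery_lo, surgery_hi = 10.54, 4.53, 24.52
surgery_se = se_log_rr_from_ci(surgery_lo, surgery_hi)
surgery_mid = geometric_midpoint(surgery_lo, surgery_hi)
print(f"implied RR {surgery_mid:.2f}, SE of log RR {surgery_se:.2f}")
```

The same check can be applied to any RR and CI pair in this section to confirm that the reported values are internally consistent.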

Head-to-head comparisons using NMA are shown in the online supplement (Appendix S5). Surgery was more efficacious than imiquimod (RR 2.22; CI 95% 1.04–4.76), TCA (RR 2.28; CI 95% 1.09–4.75), KOH (RR 2.48; CI 95% 1.02–6.01), cryotherapy (RR 2.43; CI 95% 1.17–5.03), 5-FU (RR 2.44; CI 95% 1.07–5.56), and polyphenon (RR 7.07; CI 95% 2.82–17.72). No significant differences were found between surgery and other ablative therapies (electrosurgery, CO2 laser), or between surgery and podophyllotoxin 0.5% solution or 0.5% cream. As regards direct comparisons (except for those involving a placebo or podophyllin), the only significant result was the superiority of CO2 laser over cryotherapy (RR 2.40; CI 95% 1.29–4.46).

Examined RCTs presented both heterogeneity and inconsistency. The Netmeta R package provided an I² value of 60% from a Q statistic for the overall network of 70.7, which had a chi-square distribution with 28 degrees of freedom and yielded a p-value below 0.0001 [25]. The Q statistic was further decomposed into heterogeneity and inconsistency components, valued at 14.7 and 56.0, respectively.
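The relationship between these figures can be checked with a short sketch that uses the standard definition of I² as the share of Q in excess of its degrees of freedom. The function name is hypothetical; the inputs are the values reported above.

```python
def i_squared(q, df):
    """I² as a percentage: the share of Q in excess of its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Values reported for the overall network.
q_total, degrees_of_freedom = 70.7, 28
q_heterogeneity, q_inconsistency = 14.7, 56.0

i2 = i_squared(q_total, degrees_of_freedom)
print(f"I² = {i2:.1f}%")
print(f"Q components sum to {q_heterogeneity + q_inconsistency:.1f}")
```

Most of the total Q comes from the inconsistency component rather than from heterogeneity within comparisons.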

As shown in the net heat plot in Fig. 3, a high inconsistency among mapping functions was found for RCTs comparing the following treatments: cryotherapy vs podophyllin 20–25% vs electrosurgery; 5-FU vs podophyllin; 5-FU vs CO2 laser vs 5-FU + CO2 laser; and CO2 laser vs cryotherapy. Treatments examined in a single study were not evaluated.

Fig. 3

Net heat plot. Assessment of consistency between direct and indirect evidence. Horizontal: detached comparisons; vertical: comparisons observed in the network; warm color in the net heat plot indicates that significant inconsistency may arise from a specific comparison and this trend is illustrated by the intensity of the color; gray color: contribution of each direct comparison to the network estimates

Table 1 presents the SUCRA results that emerged from these data. These results confirm that ablative therapy, surgery (90.9%), and electrosurgery (77.1%) are the most efficacious treatments for AGWs. The SUCRA value of combination therapies was also good (PDT + CO2 laser: 68.0%; CO2 laser + 5-FU: 67.4%; cryotherapy + podophyllotoxin 0.5% cream: 59.5%). Podophyllotoxin 0.5% solution (63.5%) and podophyllotoxin 0.5% cream (62.2%) had the highest SUCRA values of all PaTs. The SUCRA values of imiquimod, TCA, KOH, cryotherapy, and 5-FU ranged from 40 to 50%. The SUCRA value for polyphenon 15% was low at 13.1%.

Sensitivity Analyses

Only polyphenon and podophyllin + TCA had a CI that included value 1 (Appendix S6).

Worst-case (intention to treat) scenario sensitivity analyses showed a superiority of surgery over podophyllotoxin 0.5% solution (RR 1.94; CI 95% 1.00–3.76), CO2 laser (RR 2.20; CI 95% 1.05–4.60), electrosurgery (RR 2.28; CI 95% 1.10–4.74), and cryotherapy + podophyllotoxin 0.5% cream (RR 2.68; CI 95% 1.18–6.07). Ablative therapy + imiquimod was superior to imiquimod alone (RR 1.57; CI 95% 1.01–2.44) and to cryotherapy (RR 1.74; CI 95% 1.05–2.89). A superiority of podophyllotoxin 0.5% cream and podophyllotoxin 0.5% solution over cryotherapy was also found (RR 1.66; CI 95% 1.04–2.66 and RR 1.52; CI 95% 1.06–2.18, respectively) (Appendix S7).

Sensitivity analyses of SUCRA values confirmed the superiority of surgery and combination therapies. Worst-case scenario sensitivity analyses showed an increase in the efficacy of podophyllotoxin 0.5% cream and 0.5% solution (72.2% and 77.7%, respectively), as well as a decrease in the efficacy of electrosurgery due to the high number of patients lost to follow-up in the study by Stone et al. [61] (Appendix S8).

Discussion

In our NMA, ProTs (mainly surgery and electrosurgery) achieved the best CLR, with a median follow-up of 6 months. These results differ from our pooled analysis, which found higher clearance for ProTs but lower recurrence at 12 months for PaTs [97]. Few RCTs have used CLR as a study endpoint. This is unfortunate given that CLR, which assesses clearance until no recurrence, is more meaningful for patients undergoing treatment for AGWs. Combined with the more robust statistical methods of NMA, this endpoint yields more accurate results than pooled analyses. Cidofovir was ranked 4th in our SUCRA analysis. Yet it is difficult to draw conclusions about the efficacy of this treatment, as the only RCT on the topic found no significant difference from placebo.

Our results are in line with the NMA of Barton et al. [13], which concluded that ablative techniques were superior. However, unlike us, Barton et al. recommended CO2 laser as first-line treatment. This difference may be explained by the fact that their NMA was restricted to 39 RCTs, included immunocompromised patients and only one RCT using CO2 laser [64], whereas our NMA compared 49 RCTs, focused on non-immunocompromised adults and compared 5 RCTs using CO2 laser [59, 64, 69, 70, 92]. Moreover, Barton et al. found that podophyllotoxin 0.5% solution was the most cost-effective therapeutic solution, followed by CO2 laser. In our NMA, podophyllotoxin 0.5% solution achieved the best CLR among all PaTs.

Unlike systematic reviews on AGW management [10, 13, 97], our NMA examined the efficacy of combination therapies, including ablative therapy + imiquimod, cryotherapy + podophyllotoxin 0.5% cream, and CO2 laser + 5-FU. However, many combination therapies are missing from our NMA, including those most commonly recommended and used in practice: cryotherapy + imiquimod and cryotherapy + podophyllotoxin 0.5% solution. Combination therapies should be given greater consideration and should be adapted as best as possible to individual patients.

Our search was limited by restrictions on access to Chinese databases, especially regarding treatments like PDT. While our NMA results suggest that this treatment is highly efficacious, they are based on only 1 RCT (note that numerous non-randomized studies on MEDLINE have yielded the same finding [98, 99]). Other RCTs on PDT have likely been performed, but they remain inaccessible to the scientific community.

The management of AGWs is heterogeneous in terms of: the type of treatment used; the level of physician experience (for ProTs); the level of patient compliance (for PaTs); the clinical type of AGWs (papillary, flat, or pedunculated); the location of AGWs; the number of AGWs; and the sex of the patient [100, 101]. Such heterogeneity makes it more difficult to establish a clinically meaningful hierarchy of treatments. In our systematic review, more than 90% of RCTs were found to have a high risk of bias [14], thus casting doubt on the validity of published recommendations. NMAs do not raise the level of evidence or reduce the risk of bias, as they remain dependent on the methodology of each RCT. They do, however, increase statistical power because they encompass all patients included in examined RCTs. Moreover, NMAs can be used to compare treatments that have never been compared before, to identify gaps in knowledge, and to help develop clinically meaningful hierarchies of treatments [24].

Only 2 RCTs in our NMA compared a recommended ProT to a recommended PaT (imiquimod and cryotherapy in both cases) [58, 90]. Future RCTs should compare recommended ProTs and PaTs—for instance, cryotherapy vs podophyllotoxin cream or solution; surgery vs imiquimod; surgery vs podophyllotoxin cream or solution; CO2 laser vs imiquimod; and CO2 laser vs podophyllotoxin cream or solution. Moreover, combination therapies should be more thoroughly assessed to help increase the efficacy of AGW management, and to make it better adapted to the number, type, and location of AGWs. New treatments (KOH, PDT alone or as an adjuvant) should also be evaluated further. Although 5-FU was not mentioned in guidelines until 2019 [10], it could be proposed as a second-line treatment in the future.

Our systematic review and our NMA should be updated regularly. Side effects should be assessed to help physicians personalize treatments for their individual patients. Lastly, study endpoints and ProT use practices (e.g., standardization of freezing or surgical procedures) should be homogenized to allow better comparison of RCTs.

Conclusions

To conclude, in our NMA, surgery and electrosurgery achieved the best CLR, and podophyllotoxin 0.5% was the most efficacious patient-administered treatment. Combined therapies should be evaluated more in future RCTs in view of their identified effectiveness. The results of future RCTs should systematically include clinical type, number and location of AGWs, and sex of the patient, to refine therapeutic indications.

Table 1 Probabilities of treatment ranking