Using Matching-Adjusted Indirect Comparisons and Network Meta-analyses to Compare Efficacy of Brexanolone Injection with Selective Serotonin Reuptake Inhibitors for Treating Postpartum Depression
Brexanolone injection, the first therapy approved by the US FDA for the treatment of postpartum depression (PPD) in adults, has been shown to produce a significantly greater decrease in the Hamilton Rating Scale for Depression (HAM-D) total score than placebo in randomised controlled trials (RCTs) of women with PPD.
Given the rapid effect of brexanolone injection (within 60 h) sustained throughout the length of the trials (30 days), we sought to compare its efficacy data against selective serotonin reuptake inhibitors (SSRIs), the class of antidepressants most commonly prescribed for PPD, using HAM-D and Edinburgh Postnatal Depression Scale (EPDS) outcomes from currently available RCTs.
We extracted data from 26 studies identified in a systematic literature review of pharmacological and pharmacological/nonpharmacological combination therapies in PPD. Six studies were suitable to form evidence networks through which to perform indirect treatment comparisons (ITCs) of HAM-D and EPDS outcomes between brexanolone and SSRIs. Having assessed the comparability and suitability of the available evidence for analysis, we discovered significant heterogeneity in the study designs, most notably in the placebo arms of the trials. We therefore conducted matching-adjusted indirect comparisons (MAICs) between brexanolone and the placebo arms of comparator studies, subsequently using the MAIC results of brexanolone versus placebo, and results for SSRIs versus placebo, to form ITCs of brexanolone versus SSRIs at three separate time points—day 3, week 4 and last observation. ITCs were calculated as the differences in change from baseline (CFB) in HAM-D and, separately, CFB in EPDS, between treatments, and reported with 95% confidence intervals (CIs).
For all time points, MAICs showed larger differences in CFB for brexanolone compared with SSRIs. Differences (95% CIs) between brexanolone and SSRIs were 12.79 (8.04–17.53) [day 3], 5.87 (− 1.62 to 13.37) [week 4] and 0.97 (− 6.35 to 8.30) [last observation] for the HAM-D. For the EPDS, the differences in CFB were 7.98 (5.32–10.64) [day 3], 6.35 (3.13–9.57) [week 4] and 4.05 (0.79–7.31) [last observation]. Other analytical approaches are also presented to demonstrate the similarity of results, using a network meta-analysis approach, and the importance of using the MAIC method to control for the important heterogeneity between placebo arms.
Acknowledging the limitations of ITCs and this evidence base, when compared with SSRIs, these analyses suggest that brexanolone demonstrated larger differences in CFB for both patient- and clinician-reported PPD outcomes and at all investigated time points after adjusting for differences between placebos in the included studies.
Given the differences in study design, matching-adjusted indirect comparisons (MAICs) are needed to compare brexanolone with selective serotonin reuptake inhibitors (SSRIs), as a simple network meta-analysis can give misleading results; key differences include study length and lack of comparability of placebo across studies.
Brexanolone demonstrated a more rapid improvement in postpartum depression compared with SSRIs, as illustrated by the MAIC results at day 3 and week 4.
Brexanolone demonstrated larger differences in change from baseline for both patient- and clinician-reported scales.
Postpartum depression (PPD) is a debilitating mood disorder and one of the most commonly occurring complications in pregnancy and childbirth [1, 2]. The prevalence of PPD across the United States (US) is estimated to be an average of 11.5% (8–20% across states) [3, 4, 5], with global estimates of 17.7% (3–38% across 56 countries) . The Diagnostic and Statistical Manual of Mental Disorders, 5th Edition, defines PPD as a major depressive episode with onset of symptoms occurring during pregnancy or in the 4 weeks following delivery , while the American College of Obstetricians and Gynecologists considers PPD to be depression occurring up to 12 months post-delivery . Women with PPD often present with low mood, anxiety, and feelings of depersonalisation and loss of identity [1, 7]. PPD is also associated with an increased risk of substance abuse  and, left untreated, can lead to suicidality , with suicide itself being a leading cause of maternal mortality [11, 12, 13, 14]. Depressive symptoms can persist for up to 11 years following childbirth , with up to 30% of women with PPD meeting the criteria of depression during and after the first maternal year . The negative impact of PPD can extend to families, and may lead to partner stress  and depression , impaired mother–infant bonding , problems breastfeeding [20, 21], and isolation from family and friends . Children of mothers with untreated symptoms of PPD are at greater risk of impaired and/or delayed long-term physical, cognitive and psychological outcomes [15, 22, 23, 24, 25, 26, 27, 28, 29, 30].
Treatments used to manage PPD include antidepressants, adjunctive to, or in place of, psychotherapy [31, 32, 33]. Most studies of pharmacological treatments in PPD focus on selective serotonin reuptake inhibitors (SSRIs), a class of antidepressants often used in the US during the postpartum period despite not being indicated specifically for PPD [31, 34]. Current pharmacological treatments, including SSRIs, are associated with a slow rate of improvement (6–12 weeks), and there is a lack of reliable data to support the onset of efficacy occurring within the first week [36, 37]. A Cochrane systematic review of antidepressant treatment in postnatal depression reported that the evidence to support the use of SSRIs in PPD is limited; the few studies published on antidepressants in PPD are generally of low quality, with small sample sizes and risk of bias due to high patient dropout and selective reporting. Obstacles to optimal use of antidepressants include patients receiving subtherapeutic dosing, poor compliance, and inadequate follow-up due to lack of provider time [40, 41].
A soluble, β-cyclodextrin-based form of the neuroactive steroid allopregnanolone, brexanolone injection (ZULRESSO™) is the first therapy to be approved by the US Food and Drug Administration for the treatment of PPD in adults. In a set of studies in adult women with PPD (Study A: NCT02614547; Study B: NCT02942004; Study C: NCT02942017) [42, 43], brexanolone was associated with a greater change from baseline (CFB) than placebo in the primary outcome measure, the Hamilton Rating Scale for Depression (HAM-D) total score (60-h p value < 0.0001; 30-day p value = 0.0213). A greater CFB for brexanolone was also observed with the Edinburgh Postnatal Depression Scale (EPDS) compared with placebo. However, the EPDS is self-rated and is consequently a secondary outcome measure; therefore, these studies were not powered to show statistically significant differences between brexanolone and placebo on this measure of efficacy (60-h p value = 0.1244; 30-day p value = 0.2515; data on file). The studies showed symptom reduction at 24 h and continued response at 4 weeks.
Given the benefits of brexanolone in PPD demonstrated in these studies, in particular the rapid response sustained throughout the study period, it is important to assess the comparative effectiveness of brexanolone against the current standard of care; such an assessment can help doctors, patients, and formulary decision makers understand the relative efficacy of these treatments. As no head-to-head randomised controlled trial (RCT) comparisons exist, indirect treatment comparison (ITC) approaches are required , involving networks of studies linked by one or more common comparators. The suitability of such approaches for use in health technology assessment is well documented [45, 46, 47].
We performed a set of ITCs between brexanolone and comparator treatments in PPD, using available RCT data. For this analysis, the HAM-D was selected as an outcome as it is often recognised as the gold-standard outcome measure in studies of depression [48, 49, 50, 51], while the EPDS was chosen as it is more commonly used to screen for PPD in clinical practice and to measure symptom burden [32, 49]. Although the EPDS is self-rated, this measure has been shown to be significantly correlated with HAM-D scores within the three brexanolone trials; comparison of treatments based on EPDS is provided as a relevant supporting analysis . In this study, we describe (1) assessing the suitability of identified evidence for comparative efficacy methods, and (2) developing study networks to perform ITCs in order to generate estimates of comparative effectiveness. We discuss our investigation into the representativeness of the placebo arms of the three brexanolone trials—which have been pooled and are referred to collectively as 547-PPD-202 in this study—and the subsequent development of an appropriate strategy based on matching-adjusted indirect comparison (MAIC) methodology to perform more appropriate comparisons .
2.1 Clinical Trials of Brexanolone in Postpartum Depression (PPD)
Brexanolone was investigated in three multicentre, randomised, double-blind, parallel-group, placebo-controlled studies: one phase II trial (NCT02614547; referred to as Study A) and two phase III trials (NCT02942004 and NCT02942017; referred to as Studies B and C, respectively) [42, 43]. In Study B, patients were randomised in a 1:1:1 ratio to brexanolone 90 μg/kg/h (BRX90), brexanolone 60 μg/kg/h (BRX60) or placebo , while Studies A and C involved patients randomised in a 1:1 ratio to BRX90 or placebo [42, 43]. Patients in Studies A and B had severe PPD, defined as a HAM-D total score ≥ 26 [42, 43], whereas patients in Study C had moderate PPD (HAM-D score ≥ 20 but ≤ 25) . In all three trials, brexanolone was administered continuously through intravenous infusion, with the dose titrated up to and down from the maximum dose (beginning and end dose, 30 μg/kg/h) [42, 43]. An umbrella protocol facilitated a preplanned analysis of multiple measures of depressive symptoms in an integrated dataset of all three trials .
Of note, the placebo arm in 547-PPD-202 differed from a traditional placebo. Patients in this arm received medically supervised care during the 60-h infusion, as well as regular observation across the study period, whereas the placebo arms of traditional oral pharmacological therapy trials more closely resemble the level of care expected in an outpatient clinical setting (outside of a hospital level of care).
2.2 Identification of Studies
2.3 Assessment of Evidence and Development of Strategy
To plan the ITCs, in which all trials included were considered in one cohesive analysis, we assessed the comparability of the studies. The time points used in 547-PPD-202 were considerably different to those used in the other studies; in 547-PPD-202, the EPDS and HAM-D were both measured at 60 h, with follow-up time points through to 30 days, whereas studies of other treatments in the evidence base reported later time points due to longer treatment periods (first assessment at approximately 6–12 weeks). It was therefore considered inappropriate to only compare primary endpoints in the analyses. Time points for CFB were chosen as 60 h (day 3), 30 days (week 4; last time point in 547-PPD-202), and last observation (range across SSRIs studies: 4 weeks–6 months). Missing outcomes were estimated from existing data; for studies where first observation was later than day 3 or week 4, linear interpolation methods were used to impute the missing values.
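The interpolation step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that the CFB is zero at baseline (day 0) and changes linearly up to the first reported observation, and the function name and example values are hypothetical.

```python
def interpolate_cfb(target_day, first_obs_day, first_obs_cfb):
    """Linearly interpolate a change from baseline (CFB) at target_day.

    Assumption (not stated in the source): the line is anchored at
    CFB = 0 on day 0 and passes through the first reported observation.
    """
    if not 0 <= target_day <= first_obs_day:
        raise ValueError("target_day must lie between baseline and the first observation")
    return first_obs_cfb * target_day / first_obs_day


# Hypothetical comparator study: first assessment at week 6 (day 42), mean CFB of -8.4
day3_cfb = interpolate_cfb(3, 42, -8.4)
week4_cfb = interpolate_cfb(30, 42, -8.4)
```

Under this assumption, any study whose first assessment is later than day 3 is credited with a small treatment effect at day 3, which is relevant when interpreting the day 3 comparisons.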
2.4 Indirect Treatment Comparison (ITC) Methods
2.4.1 Bucher ITC
2.4.2 Network Meta-analysis
Standard network meta-analysis (NMA) techniques were used in which aggregate relative efficacy estimates for each study were incorporated into one consistent analysis under the assumptions of exchangeability. Using the relative effects between the arms of each study (and associated variance), ITCs were made between BRX90 and each of the treatments in the network while preserving randomisation. As the networks were relatively small, frequentist NMAs were conducted. All models were fitted and run using R software, using the ‘netmeta’ package.
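To illustrate how study-level relative effects and their variances are combined, the sketch below pools several SSRI-versus-placebo differences by inverse-variance weighting. It is not the 'netmeta' package and does not reproduce the authors' models: it assumes a fixed-effect (common-effect) model, which the source does not specify, and the input values are hypothetical. In a star-shaped network of two-arm trials sharing one comparator, pooling each contrast in this way and then differencing the pooled contrasts gives the fixed-effect indirect estimate.

```python
import math


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance (common-effect) pooling of study-level differences."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * d for w, d in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical SSRI-versus-placebo differences in CFB from two studies
ssri_diff, ssri_se = pool_fixed_effect([-2.0, -4.0], [1.0, 1.0])
```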
2.4.3 Data Availability
However, the within-patient correlation was not available for the studies identified in the SLR. To estimate the variance of the CFB, given that all information was available for 547-PPD-202, the observed variances for CFB were pooled at each time point; these were used as proxies to impute the variance for all studies.
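The variance imputation can be sketched as below. The source does not state how the observed variances were pooled, so a degrees-of-freedom-weighted pooled variance is assumed here; the function names and all numbers are hypothetical.

```python
def pooled_variance(variances, sample_sizes):
    """Degrees-of-freedom-weighted pooled variance across arms (assumed formula)."""
    numerator = sum((n - 1) * v for v, n in zip(variances, sample_sizes))
    denominator = sum(n - 1 for n in sample_sizes)
    return numerator / denominator


def impute_se(pooled_var, n):
    """Standard error of a mean CFB for an arm of size n, using the proxy variance."""
    return (pooled_var / n) ** 0.5


# Hypothetical CFB variances observed in two arms at one time point
proxy_var = pooled_variance([64.0, 100.0], [11, 21])
# Proxy SE for a hypothetical comparator arm with 22 patients
se_for_comparator_arm = impute_se(proxy_var, 22)
```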
2.4.4 Matching-Adjusted Indirect Comparison
Due to the notable differences between the 547-PPD-202 placebo arm and other studies, the 547-PPD-202 placebo arm was considered inappropriate for forming indirect comparisons. The BRX90 arm was thus treated as a single-arm study, meaning there was no longer a common comparator to connect the BRX90 arm into the evidence network. MAIC methods can allow treatments to be compared indirectly in the absence of a common comparator. In this study, MAICs were conducted between BRX90 and the placebo arm of a chosen comparator study to provide estimated relative effects for BRX90 versus placebo. As matching to placebo was to be performed to facilitate a link into the centre of the network, the evidence base was assessed to choose an appropriate comparator study for the MAICs. For the EPDS-based MAIC, the control arm from the study by Sharp et al. was chosen because, although it consisted of listening visits, its large sample size and the five variables available for matching were clear advantages. The other option (Appleby et al.) had a smaller sample size and fewer variables available for matching, and also included counselling sessions as part of the placebo arm. For the HAM-D-based MAIC, the study by Yonkers et al. was chosen as the comparator study. Both that study and the study by Zlotnick had larger sample sizes than the other studies and the same number of variables for matching, but the control arm in the Zlotnick study involved clinical management and mothercrafting, making the placebo arm in the study by Yonkers et al. more suitable.
After choosing suitable comparator studies, matching was performed between BRX90 and each chosen control arm by assigning statistical weights to the individual BRX90-treated patients to adjust for their over-/under-representation relative to the comparator source; this was possible due to the availability of 547-PPD-202 patient-level data. In the study by Sharp et al. , the patient characteristics used for matching were age of the mother (average and SD), race, antidepressant use at baseline, time since delivery (in weeks) at the start of treatment (average and SD), and whether the mother was primiparous; and in the study by Yonkers et al. , the patient characteristics used for matching were age of the mother (average and SD), race, and antidepressant use at baseline. Given the limited data availability, matching was performed on as many variables as possible.
Following weighting, average baseline characteristics (mean and variance) were balanced between the BRX90 and comparator arms. Weights were derived using the propensity score weighting approach proposed by Signorovitch et al. [88, 89]. The propensity score logistic regression model estimated the odds of being enrolled into the BRX90 or comparator arms. The approach uses a method of moments to allow a propensity score logistic regression model to be estimated without comparator study patient-level data. The model was estimated based on the patient-level data available for BRX90-treated patients and the published summary data available for comparator-treated patients.
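The method-of-moments weighting can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' code: the covariates, target means and function names are hypothetical, and a damped Newton iteration is used as one convenient way to minimise the convex objective. Each patient receives a weight exp(x·α), where x is the patient's covariate vector centred on the published comparator means, and α minimises the sum of these weights; at the minimum, the weighted covariate means equal the comparator means.

```python
import math


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def maic_weights(covariates, target_means, max_iter=100, tol=1e-10):
    """Weights w_i = exp(x_i . alpha) that balance weighted means to target_means."""
    k = len(target_means)
    # Centre each patient's covariates on the published comparator means
    centred = [[row[j] - target_means[j] for j in range(k)] for row in covariates]

    def raw_weights(alpha):
        return [math.exp(sum(a * x for a, x in zip(alpha, row))) for row in centred]

    alpha = [0.0] * k
    for _ in range(max_iter):
        w = raw_weights(alpha)
        grad = [sum(wi * row[j] for wi, row in zip(w, centred)) for j in range(k)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[sum(wi * row[p] * row[q] for wi, row in zip(w, centred))
                 for q in range(k)] for p in range(k)]
        step = solve(hess, grad)
        # Damped Newton step: halve until the convex objective does not increase
        t, current = 1.0, sum(w)
        while sum(raw_weights([a - t * s for a, s in zip(alpha, step)])) > current and t > 1e-8:
            t /= 2
        alpha = [a - t * s for a, s in zip(alpha, step)]
    return raw_weights(alpha)


# Hypothetical patient-level data: [maternal age, antidepressant use at baseline (0/1)]
patients = [[25, 0], [28, 1], [31, 0], [34, 1], [37, 0], [29, 1]]
targets = [30.0, 0.4]  # hypothetical published comparator means
weights = maic_weights(patients, targets)
total = sum(weights)
weighted_age = sum(w * p[0] for w, p in zip(weights, patients)) / total
weighted_ad = sum(w * p[1] for w, p in zip(weights, patients)) / total
```

To match a standard deviation as well as a mean, the squared covariate can be added as an extra column, with its target set to the squared comparator mean plus the comparator variance.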
CFB outcomes for brexanolone were weighted to give weighted mean CFBs, which were incorporated into the ITCs using EPDS and HAM-D data. As inputting into ITCs required associated standard error (SE) values, bootstrap estimators were used to generate SE values and confidence intervals (CIs) for BRX90-weighted mean CFB outcomes to account for the within-subject correlation [90, 91]. First, patients in the BRX90 arm were sampled with replacement to produce a number of bootstrap datasets. For each bootstrap dataset, a set of weights was then derived using the Signorovitch methodology, and, finally, a weighted mean CFB was calculated. This procedure was repeated 1000 times to obtain a distribution of means, and the 2.5th and 97.5th percentiles were used to generate limits of each CI and an SE.
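The bootstrap procedure above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: to keep the block self-contained, the weights are re-derived with a simplified one-covariate stand-in (solved by bisection) rather than the full multi-covariate matching, the SE is back-calculated from the percentile limits as one reading of the description above, and all data, names and the seed are hypothetical.

```python
import math
import random


def one_covariate_weights(values, target_mean):
    """Stand-in weight function: method-of-moments weights for ONE covariate,
    found by bisection on alpha so that the weighted mean equals target_mean."""
    centred = [v - target_mean for v in values]
    if min(centred) >= 0 or max(centred) <= 0:
        return None  # target lies outside the resampled range: no solution
    lo, hi = -5.0, 5.0
    for _ in range(200):
        mid = (lo + hi) / 2
        # Gradient of sum(exp(alpha * c)); it is increasing in alpha
        grad = sum(math.exp(mid * c) * c for c in centred)
        if grad > 0:
            hi = mid
        else:
            lo = mid
    alpha = (lo + hi) / 2
    return [math.exp(alpha * c) for c in centred]


def bootstrap_weighted_mean(covariate, outcome, target_mean, n_boot=1000, seed=1):
    """Percentile CI and implied SE for the weighted mean outcome."""
    rng = random.Random(seed)
    n = len(outcome)
    means = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample patients with replacement
        weights = one_covariate_weights([covariate[i] for i in idx], target_mean)
        if weights is None:
            continue  # skip resamples in which matching is impossible
        means.append(sum(w * outcome[i] for w, i in zip(weights, idx)) / sum(weights))
    means.sort()
    # Simple nearest-rank percentiles of the bootstrap distribution
    lower = means[int(0.025 * (len(means) - 1))]
    upper = means[int(0.975 * (len(means) - 1))]
    se = (upper - lower) / (2 * 1.96)  # assumes a roughly symmetric, normal-like spread
    return lower, upper, se


# Hypothetical BRX90 patient-level data: maternal age and CFB in a depression score
ages = [24, 26, 27, 29, 30, 31, 32, 33, 35, 36, 38, 40]
cfb = [-20, -18, -15, -17, -14, -19, -12, -16, -13, -18, -11, -15]
lower, upper, se = bootstrap_weighted_mean(ages, cfb, target_mean=30.0)
```

Re-deriving the weights inside every resample, rather than resampling a fixed set of weights, lets the interval reflect the uncertainty in the matching step as well as in the outcome.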
3.1 Identification of Evidence in PPD
Following data extraction and the exclusion of studies (Sect. 2.2), the final evidence base comprised six studies, i.e. Appleby et al. , Sharp et al. , Misri et al. , Hantsoo et al. , Yonkers et al. , and Zlotnick . The clinical heterogeneity of the evidence base was assessed by comparing baseline patient and study characteristics (see electronic supplementary Table 2). While some aspects differed across the trials, in most cases the differences were deemed insufficiently clear or not reported well enough to lead to clear biases in outcomes. We therefore did not make additional adjustments or perform sensitivity analyses based on baseline characteristics.
3.2 ITC Strategy
Two sets of analyses were conducted:
(1) a set of ITCs using the relative treatment effects from 547-PPD-202;
(2) a set of analyses in which the 547-PPD-202 placebo data are discarded, effectively treating 547-PPD-202 as a BRX90 single-arm trial.
This latter approach involved using the available 547-PPD-202 patient-level data to perform MAICs between BRX90 and the placebo arms of the chosen comparator studies, giving estimated relative efficacy outcomes. The weighted mean BRX90 outcomes were then linked into evidence networks as part of the ITCs.
We present the results of unadjusted ITCs (i.e. Bucher ITCs and NMAs conducted without MAIC) and adjusted ITCs (i.e. Bucher ITCs and NMAs using the MAIC-estimated relative treatment effects of BRX90, to compensate for treating 547-PPD-202 as a single-arm trial) between BRX90 and SSRIs. A diagram of the overall strategy is given in Fig. 4. The two studies used in matching, i.e. Sharp et al.  (EPDS) and Yonkers et al.  (HAM-D), were used in Bucher ITCs to evaluate the effect of matching versus including the 547-PPD-202 placebo. Comparisons with combination therapies and placebo are presented in the electronic supplementary material.
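For reference, a Bucher ITC through a common comparator can be sketched as below. This is the standard adjusted indirect comparison calculation, shown with hypothetical inputs: the indirect estimate is the difference between the two comparator-anchored effects, and its variance is the sum of their variances. In the adjusted analyses, the brexanolone-versus-placebo input would be the MAIC-estimated effect.

```python
import math


def bucher_itc(d_a_vs_c, se_a_vs_c, d_b_vs_c, se_b_vs_c, z=1.96):
    """Indirect A-versus-B estimate through common comparator C, with a 95% CI."""
    diff = d_a_vs_c - d_b_vs_c
    se = math.sqrt(se_a_vs_c ** 2 + se_b_vs_c ** 2)
    return diff, se, (diff - z * se, diff + z * se)


# Hypothetical inputs: brexanolone vs placebo, and SSRI vs placebo
diff, se, ci = bucher_itc(10.0, 3.0, 4.0, 4.0)
```

Because the variances add, the indirect estimate is always less precise than either of the comparisons that feed into it.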
3.3 Unadjusted and Adjusted ITC Analyses
In both the MAIC-adjusted Bucher ITCs and NMAs for day 3, the CFB in the HAM-D and EPDS for BRX90 was greater than that for SSRIs (HAM-D: 12.79, 95% CI 8.04–17.53 [Bucher ITC] and 13.47, 95% CI 9.84–17.10 [NMA]; EPDS: 7.98, 95% CI 5.32–10.64 [Bucher ITC] and 7.97, 95% CI 5.39–10.54 [NMA]). None of the CIs crossed zero, indicating that these differences were statistically significant. In comparison, unadjusted Bucher ITCs and NMAs for day 3 did not suggest significant differences between BRX90 and SSRIs for the EPDS, whereas for the HAM-D, significance was maintained for the NMA SSRI comparison. For both the MAIC and unadjusted analyses, results for the comparison with SSRIs were generally consistent between the two methods considered (Bucher ITC and NMA).
At week 4, SSRIs were indicated to have significantly smaller reductions in EPDS compared with BRX90, in both the MAIC-adjusted Bucher ITC and NMA (Bucher ITC: 6.35, 95% CI 3.13–9.57; NMA: 6.86, 95% CI 3.75–9.97). Similarly, in the week 4 MAIC-adjusted Bucher ITC for the HAM-D, SSRIs continued to lead to a smaller reduction in HAM-D compared with BRX90, albeit one that was indicated as nonsignificant (5.87, 95% CI − 1.62 to 13.37). However, the MAIC-adjusted NMA for the HAM-D, at the same time point, gave a significantly larger difference for BRX90 compared with SSRIs (8.41, 95% CI 2.94–13.88). The differences in the EPDS and HAM-D CFB between BRX90 and SSRIs had smaller absolute values than for day 3, likely due to the extra time available for SSRI efficacy to increase. In comparison, the unadjusted analyses for week 4 showed no significant differences between BRX90 and SSRIs.
At the last observation time point (EPDS: weeks 4–18; HAM-D: week 4 to 6 months), when SSRIs were assumed to have been able to reach maximum efficacy, the MAIC-adjusted Bucher ITC and NMA indicated that SSRIs still had smaller reductions in the EPDS and HAM-D compared with BRX90 (HAM-D: 0.97, 95% CI − 6.35 to 8.30 [Bucher ITC] and 2.02, 95% CI − 3.18 to 7.22 [NMA]; EPDS: 4.05, 95% CI 0.79–7.31 [Bucher ITC] and 3.76, 95% CI 0.62–6.90 [NMA]). However, at this time point, the magnitude of the differences between BRX90 and SSRIs was smallest and the differences for the HAM-D were no longer significant. Similar to previous time points, the unadjusted analyses at the last observation time point showed nonsignificant differences in CFB between BRX90 and SSRIs.
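The way significance is read from the CIs at the three time points can be sketched as below, using the MAIC-adjusted Bucher ITC values reported in the text. The helper names are hypothetical, and the back-calculated SE assumes a symmetric normal-approximation interval, which is an assumption rather than something stated in the source.

```python
# Differences in CFB (estimate, lower, upper) reported for the MAIC-adjusted Bucher ITCs
reported = {
    ("HAM-D", "day 3"): (12.79, 8.04, 17.53),
    ("HAM-D", "week 4"): (5.87, -1.62, 13.37),
    ("HAM-D", "last observation"): (0.97, -6.35, 8.30),
    ("EPDS", "day 3"): (7.98, 5.32, 10.64),
    ("EPDS", "week 4"): (6.35, 3.13, 9.57),
    ("EPDS", "last observation"): (4.05, 0.79, 7.31),
}


def excludes_zero(lower, upper):
    """A 95% CI that excludes zero is read as a significant difference."""
    return lower > 0 or upper < 0


def implied_se(lower, upper, z=1.96):
    """Back-calculate an SE, assuming a symmetric normal-approximation CI."""
    return (upper - lower) / (2 * z)


significant = {key: excludes_zero(lo, hi) for key, (_, lo, hi) in reported.items()}
```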
The results for the MAIC-adjusted analyses demonstrated that at all time points, BRX90 led to a greater CFB than SSRIs, for both EPDS and HAM-D. In contrast, the unadjusted analyses showed little difference between BRX90 and SSRIs. At each time point, the key differences in results lie between the MAIC-adjusted and -unadjusted analyses. The choice of analytical approach is therefore crucial to understanding and accurately reflecting the benefit of BRX90.
As highlighted by Meltzer-Brody et al. , treatments for PPD are urgently needed as there is an increased risk of morbidity and mortality in mothers with PPD . The effects of brexanolone on children and families of patients were not investigated in 547-PPD-202; however, the rapid and sustained symptom reduction is potentially beneficial for patients and families alike. Netsi et al. investigated the long-term effects of PPD within a large observational study of women and their offspring and found that children of women with persistent and severe PPD showed an increased risk for behavioural problems by age 3.5 years, along with lower mathematics grades and higher prevalence of depression during adolescence. Furthermore, mothers with chronic PPD could suffer from depressive symptoms 11 years after childbirth . Other studies have shown an association of maternal depression with increased infanticide and decreased use of preventive paediatric health care [93, 94]. Therefore, the rapid and sustained symptom reduction shown in the 547-PPD-202 study, along with greater efficacy when compared with SSRIs via the ITC, can potentially lead to better treatment and symptom control for mothers with PPD.
In this study, we investigated the current clinical evidence regarding the efficacy and speed of improvement for treatments of PPD, conducting a comparative efficacy analysis of brexanolone with SSRIs. In the absence of head-to-head trials, we assessed the comparability of identified RCTs and performed ITCs of treatment efficacy. Overall, the MAIC-adjusted ITC results suggest a greater efficacy of BRX90 compared with SSRIs, for both patient-reported (EPDS) and clinician-reported (HAM-D) outcomes and at all time points measured. In the MAIC-adjusted Bucher ITC and NMA at the last observation time point (EPDS: weeks 4–18; HAM-D: week 4 to 6 months), despite all treatments being assumed to have had the opportunity to reach optimum efficacy, BRX90 still gave a larger CFB than SSRIs, albeit with smaller absolute differences than at earlier time points. Thus, brexanolone efficacy is comparatively rapid and also shows sustained relative efficacy compared with these comparator groups during the study period.
As noted in a recent Cochrane review for antidepressants in PPD, the evidence base for this analysis comprised a few low-quality studies with small sample sizes, many of which were performed approximately a decade ago . The severity of depression at baseline varied between all trials included in these analyses, and it is not known whether and to what extent this affected the ITC results. Notably, the difference between the greatest and lowest baseline score was larger than the minimally important difference (HAM-D: 2.1; EPDS: 2.9; data on file), indicating that choosing a different placebo arm for matching may skew the results. Specifically, the EPDS score at baseline of the pooled 547-PPD-202 BRX90 arm was greater than that of the placebo arm in the study by Sharp et al. , whereas for the placebo arm in the study by Yonkers et al.  and 547-PPD-202, the HAM-D scores were similar at baseline. Given the evidence base, each of these studies was considered best suited for matching. While there is a larger evidence base for major depressive disorder (MDD), we considered its use in this study inappropriate as patient populations of studies in MDD and PPD are not compatible with respect to sex and age range.
The placebo arms of the comparator studies differed in terms of study design. Some placebo arms also included behavioural therapy interventions; for example, the study by Sharp et al.  included listening visits, and the study by Appleby et al.  included one counselling session. To construct study networks for NMAs, the placebo arms of comparator studies were assumed to be equivalent, whereas many of these involved nonpharmacological interventions. Similarly, the trials reporting data for SSRIs were grouped, which assumed equivalent efficacy of SSRIs. As Appleby et al.  reported geometric mean values, these were assumed to be comparable with arithmetic means reported by other studies, to allow treatment comparisons to be made, which may not be appropriate.
The outcomes of two measures were used in the analysis—HAM-D and EPDS. While the EPDS was developed as a screening tool, both in clinical practice and clinical trials, it was used to measure the extent of symptoms of PPD; therefore, this tool may not have captured all relevant aspects of changing PPD symptoms as it was not built for this purpose.
One of the key clinical advantages of brexanolone, i.e. the rapid response seen in patients, leads to issues with performing ITCs against studies of treatments with slower onset of efficacy. 547-PPD-202 had approximately 4 weeks of follow-up post-treatment, but its study period of 30 days was shorter than those of the SSRI studies, which ranged from 4 weeks to 6 months. To impute the EPDS and HAM-D for data missing at earlier time points, linear interpolation methods were used. Extending a linear trend from the data may not represent the true trajectory of comparator treatment outcomes over time. Similarly, the use of last observation carried forward to provide estimates of the missing values for brexanolone at time points later than 30 days assumed that the efficacy of brexanolone was sustained past this final measurement. Furthermore, while the choice of last time point allowed SSRIs to reach their optimal efficacy, the assumption that brexanolone reached optimal efficacy at 30 days prohibited any change after this point. Imputations were also made for studies that did not report the CFB variances, instead using observed data from the 547-PPD-202 study. We cannot know whether these imputations over- or underestimated variances, but we do not expect that these imputations created bias for or against brexanolone.
The use of MAICs was deemed a necessary step in the analyses; however, as with all analytical approaches, the MAIC method has limitations. With the 547-PPD-202 placebo arms discarded, there is no common comparator when conducting ITCs, which would be the most robust method; the National Institute for Health and Care Excellence Decision Support Unit Technical Support Document 18 acknowledges unanchored analyses as more likely to be biased than anchored analyses . The same guidance document highlights that with unanchored comparisons, all treatment effect modifiers or prognostic factors should be considered. However, it is unlikely that all key study characteristics were captured during matching as only a small number of relevant characteristics were available in both 547-PPD-202 and reported comparator studies. Furthermore, the MAIC-adjusted NMA required the assumption that the estimated relative effects of BRX90 were constant across the patient population to which the BRX90 patient-level data were matched and the patient populations of the rest of the network. In addition, the choice of study placebo arm with which to match/adjust brexanolone data is subjective and different choices may lead to different results. Across any choice of study placebo arm to match to, brexanolone, an intravenous drug, is compared with an oral placebo. As these analyses are retrospective, the impact of the different treatment administrations is not anticipated to strongly influence comparisons of efficacy outcomes. However, the impact of oral versus intravenous administration is recognised to influence the decision-making process when considering PPD treatment pathways. This impact is anticipated to be reflected through other measures such as quality of life and cost implications. Similarly, payers and decision makers should also consider safety implications associated with the treatment options.
While there are limitations associated with performing ITCs with the identified studies given the difference in endpoint times, the justification for this is to account for the relatively rapid mode of action of brexanolone, a significant clinical and patient-relevant advantage. It is of note that the imputation methods used produce an SSRI effect at day 3; however, it is uncertain whether this is an actual treatment effect as investigations into whether SSRIs have an effect by 3 days are limited and have led to mixed results [36, 37]. Therefore, the interpolation for day 3 results may be biased in favour of SSRIs. Conversely, we recognise that the assumption that the efficacy of brexanolone continues beyond the observed trial period may bias the analyses conducted at the last observation time point, although the direction of this bias is unclear. As outcomes at a longer follow-up time for patients treated with brexanolone were not available, this assumption was required in order to make comparisons with SSRIs that better reflect their long-term time-to-efficacy. It is worth noting that the MAIC-estimated treatment differences between brexanolone and SSRIs at both day 3 and week 4 are deemed clinically relevant.
Where there is only one treatment comparison of interest, which can be informed by two studies and a common comparator, the Bucher approach to ITCs may be favoured because of its transparency and its reliance on fewer assumptions (versus a broad NMA with numerous links between studies). However, NMAs have the flexibility to include more treatments and corresponding studies, although they do rely on additional assumptions of homogeneity of the included studies. Across the MAIC-adjusted and unadjusted Bucher ITCs and NMAs, the results of the BRX90 comparisons with SSRIs were broadly similar, for both the EPDS and HAM-D, which may support the use of the simpler method.
We have demonstrated the use of MAIC techniques to perform a more appropriate comparison between BRX90 and SSRIs in the absence of a suitable placebo arm. The MAICs suggest that BRX90 provides a significant improvement for the HAM-D and EPDS outcomes at day 3 relative to SSRIs. Despite a decrease in the absolute difference between BRX90 and SSRIs over time for the HAM-D and EPDS outcomes, the MAICs still suggest that at later time points, BRX90 generally provides equivalent or improved outcomes compared with SSRIs.
Despite the limitations associated with MAICs, such an approach was needed to avoid the use of the unrepresentative 547-PPD-202 placebo arm, which showed a significantly greater magnitude of improvement than the placebo arms of the comparator studies and involved a methodology (around-the-clock monitoring for a 3-day period) that is not a viable option in clinical practice. A standard NMA using these placebo data would likely be misleading as to the effect of brexanolone. Thus, the ITCs in which MAICs were used are likely to capture the true incremental efficacy of brexanolone. Given the ITC limitations regarding the lack of consistency of trial design, especially the availability of outcome data at the different time points, the analysis shown for week 4 (1 month) could be deemed more robust than the other analyses. It is of clinical relevance to provide some estimates of relative treatment effects, acknowledging limitations, at a very early time point (day 3) and also at a later time point (where maximum effect is assumed to be realised).
The authors wish to thank Jake Horgan of BresMed Health Solutions Ltd for drafting the text, copyediting, proofreading and providing overall editorial assistance as required during the preparation of this manuscript.
Compliance with Ethical Standards
Funding
This study and editorial assistance were funded by Sage Therapeutics, Inc. Open Access fees were paid by Sage Therapeutics, Inc.
Conflict of interest
Adi Eldar-Lissai and Paul Hodgkins are employees and shareholders of Sage Therapeutics, Inc., which sponsored this study. Miranda C. Cooper, Hannah S. Kilvert and Neil S. Roskell are full-time employees of BresMed Health Solutions Ltd, which received funding for the design, analysis and reporting of this research.
Open Access
This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.