Introduction

Rheumatoid arthritis (RA) is a systemic inflammatory autoimmune disease with heterogeneous disease courses that may be associated with progressive joint destruction, reduced quality of life, and even reduced survival [1, 2]. Management of RA has improved substantially in recent years, with better patient outcomes and clinical remission becoming more achievable. Because persistent joint inflammation can lead to progressive joint destruction and functional impairment, current guidelines recommend clinical remission as the primary treatment goal [3, 4]. Joint damage can occur within months of disease onset, and early aggressive treatment increases the probability of disease control and minimizes the long-term impact of RA [1, 5].

Conventional synthetic disease-modifying antirheumatic drugs (csDMARDs) such as methotrexate are generally used as initial treatment of RA [3, 4]. However, many patients have active disease despite the use of these agents or are intolerant to them. Hence, there is a need for improved RA treatments.

Baricitinib is an orally administered, selective, reversible inhibitor of Janus kinase (JAK)1 and JAK2 [6], which are endogenous signal transducers of proinflammatory cytokines involved in inflammatory diseases such as RA [7]. In two global, double-blind phase 3 studies in patients with active RA and either inadequate response (IR) to csDMARDs (RA-BUILD) [8] or IR to methotrexate (RA-BEAM) [9], once daily (QD) baricitinib was associated with clinical improvements through 24 weeks, with an acceptable safety profile. Geographic differences in RA presentation and its management may affect patient outcomes. The objective was to perform a subgroup analysis evaluating the efficacy and safety of baricitinib 4 mg compared to placebo in patients from the United States including Puerto Rico (US) and the “rest of the world” (ROW) using pooled data sets to identify possible demographic and clinical characteristics that may contribute to patient response to therapy. Understanding these characteristics can provide additional information for the clinician that is relevant to their patient population and assessment of response. The dose of baricitinib 4 mg was chosen because it was common to both RA-BEAM and RA-BUILD [8, 9].

Methods

Patients

This is a post hoc pooled analysis from RA-BUILD (NCT01721057) [8] and RA-BEAM (NCT01710358) [9]. In both trials, key inclusion criteria included ≥ 6 out of 68 tender joints, ≥ 6 out of 66 swollen joints, and IR to at least one csDMARD (stable background csDMARD permitted). In RA-BUILD, inclusion criteria included high-sensitivity C-reactive protein (hsCRP) ≥ 3.6 mg/L [8] and in RA-BEAM inclusion criteria included hsCRP ≥ 6 mg/L [9]. Additionally, inclusion in RA-BEAM required either ≥ 1 joint erosion in the hand, wrist, or foot with rheumatoid factor (RF) or anti-citrullinated peptide antibody (ACPA) positive status or ≥ 3 joint erosions regardless of RF or ACPA status [9]. Exclusion criteria in both studies included previous biologic DMARD use [8, 9]. Both studies were conducted in accordance with ethical principles of the Declaration of Helsinki and Good Clinical Practice guidelines and approved by ethical review boards for each center. All patients provided written informed consent before enrollment.

Treatment Regimens

In RA-BUILD, patients were randomly assigned 1:1:1 to placebo or orally administered baricitinib (2 or 4 mg QD) and treated for 24 weeks [8]. In RA-BEAM, patients were randomized 3:3:2 to placebo, orally administered baricitinib 4 mg QD, or subcutaneously administered adalimumab 40 mg every other week, respectively, and treated for 52 weeks [9]. In both trials, randomization was stratified by region and baseline joint erosion status, and the primary endpoint was American College of Rheumatology 20% response (ACR20) at week 12 [8, 9]. At week 16 or subsequent visits, inadequate responders received baricitinib 4 mg QD as rescue therapy. Inadequate response was defined as lack of improvement of at least 20% in both tender joint count and swollen joint count at both week 14 and week 16 compared to baseline. The pooled data set presented focuses on baricitinib 4 mg and placebo. Geographically defined subpopulations were US and ROW.
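The rescue criterion above can be sketched in code. This is a minimal illustration under one reading of the definition: a patient is an inadequate responder when both tender and swollen joint counts improved by less than 20% from baseline at both week 14 and week 16. The function names and the visit labels are hypothetical and are not taken from the trial protocols.

```python
def pct_improvement(baseline: float, current: float) -> float:
    """Percent improvement from baseline (positive means fewer affected joints)."""
    if baseline <= 0:
        raise ValueError("baseline joint count must be positive")
    return 100.0 * (baseline - current) / baseline


def is_inadequate_responder(tjc: dict, sjc: dict, threshold: float = 20.0) -> bool:
    """Apply the rescue rule to tender (tjc) and swollen (sjc) joint counts.

    Each dict maps the hypothetical visit labels 'baseline', 'wk14', 'wk16'
    to a joint count. Assumed reading: any count reaching the threshold at
    either visit means the patient is not an inadequate responder.
    """
    for visit in ("wk14", "wk16"):
        tjc_ok = pct_improvement(tjc["baseline"], tjc[visit]) >= threshold
        sjc_ok = pct_improvement(sjc["baseline"], sjc[visit]) >= threshold
        if tjc_ok or sjc_ok:
            return False
    return True


# Illustrative counts only, not trial data
no_improvement = is_inadequate_responder(
    {"baseline": 20, "wk14": 18, "wk16": 17},
    {"baseline": 10, "wk14": 9, "wk16": 9},
)
tjc_improved = is_inadequate_responder(
    {"baseline": 20, "wk14": 18, "wk16": 15},
    {"baseline": 10, "wk14": 9, "wk16": 9},
)
```

If the protocols intended a different combination of the two counts or the two visits, only the condition inside the loop would change.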

Efficacy Evaluations

The efficacy endpoints were the proportion of patients achieving ACR20, ACR 50% response (ACR50), and ACR 70% response (ACR70) [10], and the proportions of patients achieving low disease activity as measured by a Clinical Disease Activity Index (CDAI) [11, 12] score of ≤ 10, a Simplified Disease Activity Index (SDAI) [12, 13] score of ≤ 11, and clinical remission as measured by an SDAI score of ≤ 3.3 or a CDAI score of ≤ 2.8 [4, 12, 14]. Change from baseline in physical function was assessed by the Health Assessment Questionnaire-Disability Index (HAQ-DI) score [15, 16] and change from baseline in disease activity was assessed by the Disease Activity Score using 28 joint counts and hsCRP (DAS28-hsCRP) [17].
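The categorical disease-activity endpoints reduce to threshold checks on the index scores. The sketch below encodes only the cutoffs quoted above; the function and constant names are hypothetical, and the calculation of the CDAI and SDAI scores themselves is not shown.

```python
# Cutoffs quoted in the text: (remission, low disease activity)
CUTOFFS = {
    "CDAI": (2.8, 10.0),
    "SDAI": (3.3, 11.0),
}


def achieves_remission(index: str, score: float) -> bool:
    """True if the score meets the remission cutoff for the given index."""
    return score <= CUTOFFS[index][0]


def achieves_low_disease_activity(index: str, score: float) -> bool:
    """True if the score meets the low-disease-activity cutoff.

    Because the endpoint is defined as 'score <= cutoff', a patient in
    remission also satisfies this endpoint.
    """
    return score <= CUTOFFS[index][1]
```

For example, an SDAI score of 3.0 satisfies both endpoints, whereas a CDAI score of 10.5 satisfies neither.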

Safety

For the safety analyses, all randomized patients who received at least one dose of the study drug and who did not discontinue from the study for the reason “lost to follow-up” at the first postbaseline visit were included. An overview of adverse events from both trials is reported.

Statistical Analyses

Patient demographics and baseline characteristics, including sample size and percentages by treatment group, are presented using summary statistics. In the current subgroup analyses, comparisons between each baricitinib 4 mg group and placebo group were performed across subgroups (US and ROW) at weeks 12 and 24 on the modified intent-to-treat (mITT) population using summary statistics. The mITT population was defined as all randomized patients who received at least one dose of study drug.

For the categorical outcomes, nonresponder imputation was used in the analysis for patients who either received rescue therapy or discontinued from the study or study treatment. To detect significant interactions between treatment and subgroups at week 12, the following logistic regression model was used for each efficacy endpoint: treatment group + subgroup + treatment-by-subgroup interaction + study.
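Nonresponder imputation can be illustrated with a short sketch. The record layout (`Assessment`) and the function names are hypothetical; the only rule taken from the text is that patients who were rescued or who discontinued are counted as nonresponders while remaining in the denominator.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Assessment:
    """One patient's status at an analysis visit (hypothetical record layout)."""
    observed_response: bool  # response criterion met on the observed data
    rescued: bool            # received rescue therapy before this visit
    discontinued: bool       # left the study or study treatment before this visit


def nri_response(a: Assessment) -> bool:
    """Rescued or discontinued patients are counted as nonresponders."""
    if a.rescued or a.discontinued:
        return False
    return a.observed_response


def response_rate(assessments: Iterable[Assessment]) -> float:
    """Proportion of responders after imputation; every patient stays in the denominator."""
    flags = [nri_response(a) for a in assessments]
    if not flags:
        raise ValueError("no patients supplied")
    return sum(flags) / len(flags)


# Illustrative records only, not trial data
example = [
    Assessment(True, False, False),
    Assessment(True, True, False),   # observed response after rescue: nonresponder
    Assessment(False, False, False),
    Assessment(True, False, True),   # discontinued: nonresponder
]
rate = response_rate(example)
```

In this illustration three patients have an observed response, but only one counts as a responder after imputation, so the imputed rate is lower than the observed rate.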

An interaction p value ≤ 0.10 was considered statistically significant for this analysis. When the sample size requirements were not met (if any of the treatment groups in the US or ROW subgroups had fewer than 30 patients in a stratum within a subgroup or fewer than five responders in any level of the factors in the model), the interaction p value was not calculated. Within a subgroup, the odds ratio and 95% CI are from a logistic regression model: treatment group + study. When the aforementioned sample size requirements were not met, the p value from the Cochran–Mantel–Haenszel test was used instead of the odds ratio and 95% CI.
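The decision rule in this paragraph can be sketched as two checks. This is a simplification that applies the patient and responder minimums to each treatment-by-subgroup cell; the function names are hypothetical. The patient counts in the example are the pooled group sizes reported in this paper, while the responder counts are invented placeholders, not trial results.

```python
MIN_PATIENTS = 30         # minimum patients per cell
MIN_RESPONDERS = 5        # minimum responders per cell
ALPHA_INTERACTION = 0.10  # significance threshold for the interaction p value


def interaction_test_allowed(cells: dict) -> bool:
    """cells maps (treatment, subgroup) to (n_patients, n_responders)."""
    return all(
        n >= MIN_PATIENTS and r >= MIN_RESPONDERS
        for n, r in cells.values()
    )


def interaction_is_significant(p_value: float) -> bool:
    """An interaction p value at or below 0.10 is treated as significant."""
    return p_value <= ALPHA_INTERACTION


# Group sizes from the pooled data set; responder counts are placeholders
cells_ok = {
    ("BARI", "US"): (96, 40), ("PBO", "US"): (92, 20),
    ("BARI", "ROW"): (618, 300), ("PBO", "ROW"): (624, 150),
}
cells_sparse = dict(cells_ok)
cells_sparse[("PBO", "US")] = (92, 2)  # too few responders in one cell

allowed = interaction_test_allowed(cells_ok)
blocked = interaction_test_allowed(cells_sparse)
```

Applied to the week 12 interaction p values reported in the Results, `interaction_is_significant(0.852)` and `interaction_is_significant(0.424)` both return `False`.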

Interpretation of subgroup interaction analyses that had a p value of ≤ 0.10 began with an examination of the direction (same as or opposite to overall treatment effect) followed by the magnitude of the treatment effect across the geographic subgroup.
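For reference, the two logistic regression models described in this section can be written out explicitly. The notation is an assumption made for illustration: Y_i is the binary response for patient i, T_i indicates treatment (1 for baricitinib 4 mg, 0 for placebo), G_i indicates subgroup (1 for US, 0 for ROW), and S_i indicates study (1 for RA-BEAM, 0 for RA-BUILD). The actual variable coding used in the analysis is not stated in the text.

```latex
% Interaction model, fitted for each efficacy endpoint at week 12
\operatorname{logit}\Pr(Y_i = 1)
  = \beta_0 + \beta_1 T_i + \beta_2 G_i + \beta_3 (T_i \times G_i) + \beta_4 S_i

% Within-subgroup model, fitted separately in the US and ROW subgroups
\operatorname{logit}\Pr(Y_i = 1)
  = \alpha_0 + \alpha_1 T_i + \alpha_2 S_i,
\qquad \mathrm{OR} = e^{\alpha_1}
```

Under this notation, the interaction p value tests the hypothesis that β_3 = 0, and the within-subgroup odds ratio for baricitinib versus placebo is the exponentiated treatment coefficient α_1.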

Results

Baseline Characteristics

The pooled data set consisted of 714 (96 US; 618 ROW) and 716 (92 US; 624 ROW) patients in the baricitinib 4 mg and placebo groups, respectively. The proportions of patients who were White or Black/African-American were higher in the US subgroup, whereas Asian patients were more represented in the ROW subgroup (Table 1).

Table 1 Baseline demographics and clinical characteristics

Patients in the US subgroup were slightly older, had higher mean weight and body mass index (BMI), and had fewer years of RA from time of diagnosis; lower percentages of US patients were RF/ACPA-positive or used corticosteroids. Mean modified total Sharp score (mTSS) and hsCRP were higher in the ROW than in the US subgroup. More patients in the US subgroup had previously used only one csDMARD, whereas more patients in the ROW subgroup had previously used at least two csDMARDs.

Efficacy

At week 12, the proportions of patients achieving ACR20, ACR50, and ACR70 responses were higher in the baricitinib group compared to the placebo group within both the US and ROW subgroups (Fig. 1a). The odds ratios for multiple efficacy measures for the baricitinib versus placebo comparisons were 2- to 3-fold among the US subgroup and 3- to 6-fold among the ROW subgroup (Fig. 1b), favoring baricitinib over placebo. At week 12, the interaction p values were not significant (ACR20, p = 0.852; ACR50, p = 0.424; ACR70, p value not calculated because of small sample size). Similarly, a higher proportion of baricitinib-treated patients compared to placebo-treated patients responded at 24 weeks (Fig. 1c, d).
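As a reading aid for the odds ratios, the sketch below computes an unadjusted odds ratio with a Wald 95% CI from a 2 × 2 table. It is a simplification: the odds ratios reported here come from a logistic regression model that also includes study, which this calculation omits. The function name is hypothetical and the counts are illustrative, not trial data.

```python
import math


def odds_ratio_wald(resp_trt: int, n_trt: int, resp_ctl: int, n_ctl: int,
                    z: float = 1.96):
    """Unadjusted odds ratio (treatment vs control) with a Wald confidence interval."""
    a, b = resp_trt, n_trt - resp_trt  # treatment: responders, nonresponders
    c, d = resp_ctl, n_ctl - resp_ctl  # control: responders, nonresponders
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell of the 2 x 2 table must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative counts: 60/100 responders on treatment vs 40/100 on control
or_, lo, hi = odds_ratio_wald(60, 100, 40, 100)
```

With these illustrative counts the odds of response are 60/40 on treatment and 40/60 on control, giving an odds ratio of 2.25; an interval that excludes 1 indicates a difference between arms at the 5% level.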

Fig. 1

Efficacy at 12 and 24 weeks: baricitinib 4 mg versus placebo. Percentage (95% CI) of patients achieving ACR20, ACR50, and ACR70 at 12 weeks (a) and 24 weeks (c) using nonresponder imputation. The numbers within the bars are the actual percentages. Odds ratios (95% CI) for the baricitinib versus placebo comparisons at 12 weeks (b) and 24 weeks (d) for ACR20, ACR50, and ACR70. The values within the bars are the odds ratios. ACR20, 20% improvement using American College of Rheumatology criteria; ACR50, 50% improvement using American College of Rheumatology criteria; ACR70, 70% improvement using American College of Rheumatology criteria; BARI, baricitinib 4 mg; CI, confidence interval; N, population size; PBO, placebo; ROW, rest of the world; US, United States including Puerto Rico

At week 12 and week 24, baricitinib-treated patients experienced greater improvements in their DAS28-hsCRP and HAQ-DI scores relative to placebo-treated patients within both the US and ROW subgroups (Fig. 2). At weeks 12 and 24, a greater proportion of baricitinib-treated patients achieved low disease activity (CDAI ≤ 10 and SDAI ≤ 11) and remission (CDAI ≤ 2.8 and SDAI ≤ 3.3) compared to placebo-treated patients in both the US and ROW subgroups (Figs. 3, 4). In summary, the baricitinib treatment effect versus placebo was similar in the US and ROW subgroups for multiple efficacy endpoints.

Fig. 2

Change from baseline in DAS28-hsCRP and HAQ-DI at 12 and 24 weeks: baricitinib 4 mg versus placebo. Least squares mean (95% CI) using mLOCF are shown for DAS28-hsCRP at 12 (a) and 24 weeks (b) and for HAQ-DI at 12 (c) and 24 weeks (d). The numbers within the bars are the actual values. BARI, baricitinib 4 mg; CI, confidence interval; DAS28-hsCRP, Disease Activity Score using 28 joint counts and high-sensitivity C-reactive protein; HAQ-DI, Health Assessment Questionnaire-Disability Index; LSM, least squares mean; mLOCF, modified last observation carried forward; N, population size; PBO, placebo; ROW, rest of the world; US, United States including Puerto Rico

Fig. 3

CDAI at 12 and 24 weeks: baricitinib 4 mg versus placebo. The percentage (95% CI) of patients achieving low disease activity defined as CDAI ≤ 10 at 12 weeks (a) and 24 weeks (b) and remission defined as CDAI ≤ 2.8 at 12 weeks (c) and 24 weeks (d) using NRI are shown. The numbers within the bars are the actual percentages. BARI, baricitinib 4 mg; CDAI, Clinical Disease Activity Index; CI, confidence interval; N, population size; NRI, nonresponder imputation; PBO, placebo; ROW, rest of the world; US, United States including Puerto Rico

Fig. 4

SDAI at 12 and 24 weeks: baricitinib 4 mg versus placebo. The percentage (95% CI) of patients achieving low disease activity defined as SDAI ≤ 11 at 12 weeks (a) and 24 weeks (b) and remission defined as SDAI ≤ 3.3 at 12 weeks (c) and 24 weeks (d) using NRI are shown. The numbers within the bars are the actual percentages. BARI, baricitinib 4 mg; CI, confidence interval; N, population size; NRI, nonresponder imputation; PBO, placebo; ROW, rest of the world; SDAI, Simplified Disease Activity Index; US, United States including Puerto Rico

Safety

Full adverse event and safety information was previously reported in the RA-BUILD and RA-BEAM manuscripts [8, 9]. Because of the small number of patients in the US subgroup, only limited analyses were conducted to evaluate the incidence of adverse events in the US compared to the ROW. Safety through 12 and 24 weeks is summarized in Table 2. At both 12 and 24 weeks, there were no notable differences between the US and ROW subgroups. Herpes zoster events were more frequently observed with baricitinib treatment. There were few malignancies and deaths, with no notable differences between groups.

Table 2 Safety

Discussion

In this post hoc, pooled subset analysis of two phase 3 studies in patients with an IR to csDMARDs, baricitinib 4 mg was efficacious compared to placebo in both the US and the ROW subgroups. There were no notable between-subgroup differences in safety.

The geo-epidemiology of RA and other inflammatory autoimmune diseases has been discussed in the literature [18, 19]. The development of RA is complex and multifactorial with the development and severity of RA being linked to both environmental and genetic factors, such as exposure to tobacco smoke and pollutants and/or the presence of specific genetic markers, respectively. At least among women in the United States, there is geographic variation in the incidence of RA even after controlling for confounders, suggesting that regional differences in behavior, climate, environmental exposures, genetic factors, or diagnosis may exist [20].

Despite numerous geo-epidemiological studies, little is known about treatment differences between the US and the ROW and whether geographic subpopulations of RA patients respond differently to specific treatments. Within the racially and ethnically diverse US population, it is unknown whether there are differences in efficacy and safety of therapeutic agents among the various racial and ethnic subgroups. Although it is known that race-related patient preferences and access play a role in the types of RA drugs taken by Black/African-Americans [21, 22], these issues should not affect results in a clinical trial setting. Unfortunately, these studies enrolled a very small number of Black/African-Americans across the two trials (14 placebo; 11 baricitinib) (Table 1), so it is difficult to make meaningful comparisons between this subgroup and other races in this study. Similarly, the sample sizes were too small to perform additional subgroup analyses such as by age, ethnicity, or disease characteristics.

The predominant limitation of this post hoc pooled analysis was the small sample size of the US subgroup (n = 188) compared to the ROW subgroup (n = 1242). A further limitation was the differences in baseline characteristics between the US and ROW subpopulations, which included time since RA diagnosis, weight, BMI, proportion of RF/ACPA-positive patients, hsCRP, number of csDMARDs previously used, corticosteroid use, and evidence of damage (mean mTSS). All of these were greater in the ROW subpopulation, with the exception of weight and BMI, which were greater in the US. However, baseline disease activity scores were similar in the US and ROW subgroups. It is possible that RA patients in the US may not have progressed as far as patients from some other countries because of earlier treatment and access to care, including earlier use of csDMARDs; this is reflected in the higher percentage of US patients who had previously used only one csDMARD before enrolling in the clinical trials.

Published literature has demonstrated that high BMI/obesity may negatively impact the response to tumor necrosis factor (TNF) inhibitors [23, 24, 25] and that the presence and levels of baseline autoantibodies correlate with clinical responses. In the AMPLE trial, baseline ACPA positivity (versus ACPA negativity) as measured by an anti-CCP2 ELISA was associated with better responses to abatacept or adalimumab; patients with the highest baseline ACPA antibody concentrations had better clinical response to abatacept than patients with lower concentrations, which was not observed with adalimumab [26]. These analyses were not possible here because of the limited sample sizes in this subpopulation analysis.

Additionally, we previously performed an integrated subgroup analysis through 12 weeks for an all-phase csDMARD-IR analysis set to evaluate potential subgroup interactions for baricitinib 4 mg QD versus placebo (data on file). This analysis set includes data from placebo- and active-controlled phase 2 and phase 3 studies that evaluated baricitinib on a background of csDMARD treatment in patients with active RA. Consistent with results in the overall patient samples from these studies, there was no evidence suggesting an absent or unfavorable baricitinib treatment effect (i.e., a qualitative interaction) in any subgroup based on demographic or disease-related characteristics (including age, RF/ACPA positivity, or disease state at baseline).

Conclusions

Baricitinib 4 mg demonstrated higher clinical responses compared to placebo in RA patients with IR to methotrexate and/or other csDMARDs within both the US and ROW subpopulations despite some differences in baseline patient characteristics. As more targeted synthetic agents are developed to treat various rheumatic diseases, it is crucial to establish that such therapies maintain their efficacy and safety profile in wide, heterogeneous patient populations.