FormalPara Key Points

We developed a new framework on the basis of both AI and non-AI approaches to ease some of the most time-consuming and labor-intensive steps of signal detection and validation, such as the selection of controls and the identification of co-reported drugs that may serve as alternative causes for the reported adverse events in signals of disproportional reporting.

Future work is needed to further assess the extent of impact of our framework for other drugs and spontaneous reporting databases when performing routine signal detection and validation activities.

1 Introduction

Individual case safety reports (ICSRs) recorded in spontaneous reporting databases have been and continue to be the cornerstone and the most important data source in signal detection. In recent years, regulatory agencies could identify more than 80% of potential safety signals leading to regulatory action using ICSRs [1]. Despite the richness of this data source, the incessantly increased amount of ICSRs recorded in spontaneous reporting databases has generated problems in signal detection and validation due to the increase in resources and time needed to process cases. The number of ICSRs recorded in the US Food and Drug Administration Adverse Event Reporting System (FAERS) in the last 10 years (2011–2021) has increased threefold, and a similar trend is seen in Europe (i.e., EudraVigilance) [2].

Time- and resource-demanding activities related to processing ICSRs include manual procedures to evaluate individual causality with the final goal of dismissing false-positive safety signals. One of the most time- and resource-demanding procedures for dismissing safety signals is the identification of alternative causes for the reported adverse events (AEs) in ICSRs after disproportionality signals have been identified. This case assessment/signal validation procedure is currently normally performed manually by evaluating variables such as the co-reported drugs in ICSRs [3]. Co-reported drugs are screened to identify identifying alternative potential causes for the newly identified drug–event pair. Additionally, co-reported drugs are screened to evaluate whether the disease for which these drugs are used (i.e., the therapeutic indication) may itself represent an alternative cause for the AEs. Naturally, care must be taken, as listed drugs that were used to treat the AE may have been listed by the reporter as well. For the latter, recent research has shown that the appropriate selection of controls in disproportionality analysis can mitigate the need for such activity, especially when drugs used for the same therapeutic indication are chosen as controls [4, 5]. Activities to identify alternative causes or appropriate controls in disproportionality analysis are time- and resource-demanding [3] when performed manually. However, these activities are crucial to dismiss false-positive signals, as shown in the 2020 annual report on EudraVigilance from the European Parliament, in which 80% of all potential safety signals in 2018–2020 ended up being not validated, or rather being false positives [2].

Eminent experts and a representative from pharmaceutical industries and regulatory agencies have highlighted the need to automatize time- and resource-demanding procedures in signal detection and validation [6]. However, to date there is a sparse availability of automatized tools for such purposes.

Therefore, this study aimed at developing automatized procedures for the identification of alternative causes for reported AEs in disproportionality signals originating from spontaneous reporting databases, with the final goal of dismissing false-positive disproportionality signals. For this purpose, FAERS was used as a data source and galcanezumab as a proof-of-concept drug, as this is a drug recently approved for which no studies have been conducted using spontaneous reporting databases.

2 Methods

2.1 Automatic Selection of Controls

In this article, we present a dual approach combining the Anatomical Therapeutic Chemical (ATC) classification system code and the approved therapeutic indication in the US Summary of Product Characteristics (SmPC) of galcanezumab for automatizing the selection of controls for disproportionality analysis when using FAERs. The ATC code is a unique code assigned to all active ingredients and consists of five levels: (1) the anatomical main group, (2) the therapeutic subgroup, (3) the pharmacological subgroup, (4) the chemical subgroup, and (5) the chemical substance [7]. By using the 4th level of the ATC code, or rather the chemical subgroup, we identified all active ingredients with the same therapeutic target (i.e., CGRP antagonists) as galcanezumab. Additional controls were identified by selecting active ingredients outside the chemical subgroup of galcanezumab but with the same approved therapeutic indication to avoid masking due to drug class effect [8] and confounding by indication. Drug class effect masking is a phenomenon that can occur when only drugs within the same chemical subgroup are used as controls in disproportionality analysis [8]. Confounding by indication is a type of bias that can occur when drugs used for different therapeutic indications are compared in disproportionality analysis [9].

To identify controls with the same approved therapeutic indication but outside the chemical subgroup of galcanezumab, DrugBank was used. DrugBank is a comprehensive online database containing information on active pharmaceutical ingredients and their respective pharmacological targets. In DrugBank it is possible to obtain a list of all active ingredients approved for a certain disease by searching a structured field known as “associated condition.” An “associated condition” is a term found in the drug indication section of DrugBank and refers to any specific state or medical condition for which the drug is indicated. By searching this field in DrugBank it is possible to obtain a list of drugs with an approved therapeutic indication for the medical condition under investigation which, in the case of galcanezumab, is “migraine prophylaxis.” To test the reliability of our approach for selecting controls in disproportionality analysis, we computed the proportion of drug classes correctly retrieved among drug classes mentioned in two recently published articles [10, 11]: (number of drug classes retrieved with our approach / number of drug classes mentioned in the two recently published reviews) × 100%. For each active ingredient, drug classes were obtained from the 4th level of the ATC code.

2.2 Data Sources and Management

ICSRs in which galcanezumab and the selected control groups (see Sect. 2.1) were reported as the suspected drugs were extracted from FAERS. FAERS is the spontaneous reporting database maintained by the FDA. The goal of FAERS is to support the FDA’s post-marketing safety surveillance. The database consists of spontaneous reports of AEs and product quality complaints, and medication errors reported by health care professionals and consumers are collected. ICSRs recorded in FAERS use the structure provided by the International Council for Harmonization of Technical Requirements for Pharmaceuticals for Human Use (ICH) E2B guideline to collect ICSRs. This structure is used in all spontaneous reporting databases from the countries participating in the World Health Organization’s Programme for International Drug Monitoring, which accounts for 99% of countries worldwide [12]. All available zipped ASCII quarterly data extract files from the fourth quartile of 2012 to the third quartile of 2021 from FAERS were downloaded from the OpenFDA website. A local database was set up in R (version 4.1.2, R Development Core Team) by using the primary identifier of each ICSR as the key to linkage based on the relational structure described by Kass-Hout and colleagues [13].

2.3 Statistical Analysis

2.3.1 Comparability of Controls

For descriptive purposes, the comparability of galcanezumab cases and the selected controls were examined by comparing their distribution of sex, age, weight (kg), reporter’s qualification, number of co-reported drugs, and the year of reporting, as these variables have been previously identified as potential sources of confounding in disproportionality analysis [14]. Additionally, as a measure of comparability, the density functions of the propensity score of being in a report having galcanezumab as the suspected drug of these variables was computed for assessing the overlap in density functions among galcanezumab cases and selected controls [15, 16]. The propensity score was calculated using a logistic regression model with the following covariates: sex, age, weight (kg), reporter’s qualification, number of co-reported drugs, and the year of reporting. For the calculation of the propensity score, we replaced the missing values with an index value (e.g., for age we assigned the value 9999) to perform the computation for all cases. The default imputed value for age is convenient for categorization but may be suboptimal when included as a continuous explanatory variable in a model. Therefore, a subsequent sensitivity analysis was performed for the calculation of the propensity score where instead of using 9999, we imputed with the mode age and added another variable for whether age is missing or not.

2.3.2 Dismissal of False-Positive Disproportionality Signals

In the pool of the potential controls (see Sect. 2.1), two drugs within the chemical subgroup of galcanezumab and two drugs outside the chemical subgroup with the same therapeutic indication of galcanezumab were randomly chosen for disproportionality analyses. The reporting odds ratio (ROR) was used as a disproportionality measure and the 95% confidence interval (CI) of all AEs recorded for galcanezumab and the selected controls in FAERS with a frequency of reporting greater or equal to 3 was calculated according to the recommendations described by the European Medicine Agency (EMA) [17]. Of note, in the context of this framework, the ROR is defined as the ratio of spontaneous reports for a given drug that report a specific AE divided by the corresponding ratio of the drug used as a control in disproportionality analysis. Heatmaps of RORs for disproportionality analysis were plotted. Disproportionality signals for which the lower limit of the 95% CI was greater than 1 were further analyzed by using conditional inference trees (R Package: partykit; Version: 1.2–16; function: ctree) [18] to identify alternative cause co-reported drugs.

The use of a conditional inference tree required data management, as information stored in FAERS was not immediately usable for such a purpose. In particular, all the information on co-reported drugs in the ICSRs were stored as binary variables. Before using the conditional inference trees, binary variables were filtered on the basis of variance, and variables having ≥ 95% identical values across cases and controls were removed. The idea of this filter was to get rid of variables that only consist of noise and therefore have very little variation [19]. Pre-filtering has been performed to reduce the size of the dataset and increase speed in computational power while reducing RAM resources needed to store in R the highly dimensional dataset. The pre-processing is automatized and there is no human time involved in this procedure. For each disproportionality signal, conditional inference trees were used to identify disproportionally co-reported drugs among those variables kept after filtering based on variance. The steps for fitting the conditional inference trees were the following:

  1. 1.

    We first filtered all cases with the adverse event of interest;

  2. 2.

    We generated a highly dimensional set of co-reported drugs for cases identified in step 1);

  3. 3.

    We fitted the conditional inference tree using as the class a binary variable named outcome, which included information on the drugs for which we performed the head-to-head comparison and co-reported drugs as predictors. For example, if we compared galcanezumab versus erenumab, the outcome variable will have two levels with labels “galcanezumab” and “erenumab.” The main reason for doing so is that we wanted disproportionally co-reported drugs only for the adverse event under investigation;

  4. 4.

    All the co-reported drugs with a p-value < 0.05 were considered disproportionally co-reported and underwent validation.

The SmPC of disproportionally co-reported drugs was screened to identify those drugs that listed in the SmPC the AE in disproportionality signal mimicking procedures performed during signal validation [3, 20,21,22,23,24,25]. The disproportionality analysis was conducted again by removing cases with co-reported drugs for which the AE under investigation was listed in the SmPC as these cases had alternative causes for the AE. Finally, we calculated the fraction of disproportionality signals dismissed by using this procedure. A disproportionality signal, that is, a statistical alert, was considered dismissed if there was no statistical significance, according to the threshold set by EMA, after dismissing cases with co-reported drugs for which the AE under investigation was listed in the SmPC of the co-reported drugs [17].

A flowchart showing the implementation of automatic selection of controls and dismissal of false positive signals using conditional inference tree is shown in Fig. 1.

Fig. 1
figure 1

Flowchart showing implementation of automatic selection of controls and dismissal of false positive signals using conditional inference tree

2.4 The Rationale for Using Conditional Inference Trees

A conditional inference tree is a method to build up a prediction model via classifying characteristics of cases and controls (i.e., predictors) to the outcome (i.e., the AE). In a decision tree, the characteristics of the cases/controls and outcomes under investigation can be binary: true or false, yes or no.

A conditional inference tree consists of a root, branches, and leaves. The predictor that has a higher degree of association with the outcome represents the root node of the tree. To determine the strength of association of individual predictors and therefore determine the hierarchy of the conditional inference trees, impurity measures are used (e.g., Gini Impurity Index) and statistical tests are used to determine disproportionality (i.e., chi-squared test). Impurity measures assess how good a predictor is in classifying individuals as having/not having the outcome. Each time before adding a new set of predictors for the outcome, the impurity of the whole tree is computed. The smaller impurity the better the classification is. Where to draw the line to divide the numerical value depends on the impurity as well.

In this study, conditional inference trees were used to identify co-reported drugs that were disproportionally reported between cases and controls reporting the AE under investigation. The advantage of using decision trees in this setting relies on the fact that this technique, if compared with the current approach used in disproportionality analysis (i.e., stratification), can investigate preemptive non-linear relationships between predictors and the outcome. The rationale for using conditional inference trees is also epidemiological. If cases and controls have statistically significant differences in the proportion of a co-reported drug, the latter may represent a potential confounder for the investigated signals if we subset cases and controls with the AE under investigation.

2.5 Framework Validation Using Simulated Data

The objective of the simulation was to show some generalizability of the conditional inference trees for the identification of confounders among co-reported drugs in disproportionality signals. The method was able, in three different scenarios, to correctly identify the disproportionally co-reported drugs serving as confounders for the disproportionality signals.

Scenario 1: We have simulated 1000 cases of a hypothetical AE for galcanezumab (500 cases) and a control (500 cases) from the multivariate normal distribution of three Gaussians [26] as follows: age~N(40,1), weight~N(75,1), and number of co-reported drugs~N(2,1). The correlation between age and weight was −0.8, the correlation between age and the number of co-reported drugs was 0.7, and the correlation between weight and the number of co-reported drugs was −0.9. We have then generated three binary variables (i.e., co-reported drug 1, co-reported drug 2, and co-reported drug 3) with marginal probabilities of 0.25–0.75, 0.75–0.25, and 0.5–0.5 with a binary variable named outcome. The variables co-reported drug 1, co-reported drug 2, and co-reported drug 3 were generated by thresholding a normal distribution. The threshold was the median of the numerical variable generated with a normal distribution as shown in Electronic Supplementary Material (ESM) 1. The correlations of the components were specified as a correlation matrix of the binary distribution [27]. Co-reported drug 1 was correlated with co-reported drug 2 (i.e., 0.514) while co-reported drug 3 was not correlated with co-reported drug 1 and co-reported drug 2 (i.e., −0.020 and 0.048). The variable outcome was correlated with co-reported drug 1 (i.e., −0.628) and co-reported drug 2 (−0.370) but not with co-reported drug 3 (i.e., −0.026).

This setup accounts for a scenario where co-reported drug 1 and co-reported drug 2 are correlated with the outcome, as they are disproportionally co-reported between galcanezumab and the control cases having a hypothetical AE. Additionally, in this scenario, co-reported drugs 1 and 2 are correlated variables with different strengths of correlation with the outcome. It is expected that the framework identified in the decision tree co-reported drug 1, as it is the co-reported drug with the strongest correlation with the outcome.

Scenario 2: We used the same setting as in scenario 1 but we reduced the sample size to 50 cases. This setup accounts for a scenario in which there is a limited amount of cases to perform disproportionality analysis. It is expected that the framework identified in the decision tree co-reported drug 1, as it is the co-reported medication with the strongest correlation with the outcome.

Scenario 3: We used the same setting as in scenario 1 but we dropped co-reported drug 1 from the dataset and changed the marginal probabilities of co-reported drug 2 to obtain a 20% disproportionality between galcanezumab and the control’s cases. This setup accounts for a scenario in which there is a relatively small disproportionality between cases and controls for co-reported drug 2. It is expected that the framework identified in the decision tree co-reported drug 2, as it is the co-reported medication with the strongest correlation with the outcome.

In all scenarios, we performed a descriptive analysis by presenting frequencies and percentages of binary variables and plotting the kernel density distribution of age, weight, and number of co-reported drugs and correlation matrix of variables in the analytical datasets. Additionally, we modeled a conditional inference tree to identify co-reported drugs that were disproportionally reported between galcanezumab and the control reporting the AE under investigation.

3 Results

3.1 Automatic Selection of Controls and Comparability

Table 1 provides an overview of drugs that were identified as potential control groups for galcanezumab. Erenumab, fremanezumab, amitriptyline, and topiramate were finally used as control groups for disproportionality analysis. In total, 86% of drug classes among those eligible for disproportionality analysis were identified. Supplementary Figs. 1, 2, 3, and 4 and Supplementary Table 1 presents the comparability of erenumab, fremanezumab, topiramate, and amitriptyline cases, respectively, in comparison with galcanezumab cases for density distributions of age, sex, weight (kg), reporter’s qualification, number of co-reported drugs, the year of reporting, and propensity score. The results of the sensitivity analysis for the computation of propensity score did not show major differences from the main analysis (ESM 2).

Table 1. Potential control groups of galcanezumab identified within and outside the chemical subgroup

3.2 Dismissal of False-Positive Disproportionality Signals

In total, 399 signals of disproportionate reporting were identified for galcanezumab when compared with fremanezumab (n = 22), erenumab (n = 111), topiramate (n = 130), and amitriptyline (n = 136). Supplementary Figs. 5, 6, 7, and 8, show heatmaps of RORs of all identified disproportionality signals when fremanezumab, erenumab, topiramate, and amitriptyline were used as controls, respectively.

On the basis of purely co-reporting, it was possible to dismiss 22 (20.00%), 19 (14.29%), and 18 (13.33%) of the disproportionality signals identified for galcanezumab in comparison with erenumab, topiramate, and amitriptyline, respectively. In all, 340 disproportionality signals were not dismissed. Among disproportionality signals that cannot be dismissed on the basis of purely co-reported drugs, we estimated a further 0%, 15.32%, 20.88%, and 23.35% reduction in the number of galcanezumab cases to undergo manual validation in comparison with fremanezumab, erenumab, topiramate, and amitriptyline, respectively, as these proportions of cases have co-reported drugs as alternative causes.

3.3 Framework Validation

Descriptive analysis of cases simulated in scenarios 1, 2, and 3 are presented in Supplementary Tables 2–4 and Supplementary Figs. 10–12. The results of the conditional inference trees are plotted in Supplementary Figs. 13–15. In all scenarios, the framework was able to correctly identify the co-reported drug simulated as an alternative cause for the disproportionality signal.

4 Discussion

4.1 Automatic Selection of Control Groups and Comparability

At the moment, neither the EMA nor the FDA have described any guidelines concerning the selection of controls in disproportionality analysis, however, it is well known and acknowledged that the control group influences the results of a disproportionality analysis [28, 29]. This is in part as disproportionality analyses are frequently conducted against a background of the rest of the database [30]. It has to also be admitted, however, that choice of an appropriate control can sometimes be nearly impossible in a given dataset and the subjectivity associated with selection of controls. Nevertheless, this approach seems promising in some circumstances, in particular when the automatic process proposed in this study for the selection of controls within and outside the chemical subgroup of galcanezumab showed an 86% success rate, as 6 of 7 chemical subgroups indicated for acute and/or preventative migraine were identified. The selection of controls was based on pharmacoepidemiological principles of trying to increase the similarity in case characteristics among compared drugs in disproportionality analysis by including controls within the same chemical subgroup and controls outside the chemical subgroup with the same therapeutic indication. To assess similarity among cases and controls, we compared the propensity score of being in a report having galcanezumab as the suspected drug built using covariates that were known to introduce confounding in disproportionality analysis [31]. As expected, the drugs within the same chemical subgroup (i.e., fremanezumab and erenumab) were the most comparable to the galcanezumab group as shown by higher overlapping between propensity score density functions. Topiramate and amitriptyline were not as comparable, as they had a smaller proportion of female participants, a fewer number of co-reported drugs, a difference in years of reporting, as topiramate and amitriptyline have been marketed for a longer period, and lastly a higher median age. Such differences may be explained by the use of topiramate and amitriptyline for other therapeutic indications (i.e., depression), which is known to affect older adults more than those receiving galcanezumab for migraine prophylaxis [32, 33]. An interesting observation that was made during this study was the correlation between the comparability of cases and the control groups and the number of disproportionality signals identified in the statistical analysis, as shown in Supplementary Table 1. In this regard, we hypothesized that the number of disproportionality signals is influenced by the comparability of the selected control group to the drug under investigation. Furthermore, we believe using the overlap of the propensity score to prioritize controls in disproportionality analysis will be problematic, as most drugs within the 4th level of the ATC code will be retained as controls in disproportionality analysis. Finally, we believe that other approaches may be more effective in computing propensity score, such as the use of co-reported drugs as individual covariates rather than the number of co-reported drugs and the restriction of the study period. However, further studies are needed to validate these hypotheses, and while theoretically this is the case, non-random reporting patterns may mean that this is not the case in practice.

4.2 Automatic Identification of Alternative Causes in Disproportionality Signals

Our AI-based approach introduced automation in the identification of confounders during signal validation by identifying co-reported drugs that were disproportionately co-reported among galcanezumab cases and control groups. The rationale for conducting this analysis relies on pharmacoepidemiological principles. To be an alternative cause in disproportionality analysis, a co-reported drug needs to be associated with the occurrence of the AE under investigation and needs to be disproportionally co-reported between cases and controls. Our analytical approach performs this evaluation by using a conditional inference tree to identify the disproportionality of co-reported drugs among galcanezumab and controls in ICSRs with the AE under investigation. Only in one analysis, or rather, when fremanezumab was used as the control group to galcanezumab, were we not able to dismiss any of the disproportionality signals using conditional inference trees. We believe that a plausible cause for this result is that fremanezumab was the control group with the highest comparability to galcanezumab (i.e., 68.49% overlap of density functions of the propensity scores).

4.3 Limitations

The necessary step of checking the presence of an AE in the SmPC of co-reported drugs is a major component of the aforementioned time-limiting activities in signal validation, and it is not addressed in the automatic process proposed in our framework. However, we anticipate that soon there will be ad hoc tools for this task. Of note, the commercial version of DrugBank already includes this data.

Other machine learning approaches other than conditional inference trees could have been tested to assess the disproportionality of co-reported drugs but, for this pilot project, we limited our choice to conditional inference trees on the basis of the reasons provided in Sect. 2.3.

Currently, we do not have a clear cutoff to define “good” or “bad” regarding the number of drug classes that should be correctly identified by our framework during the selection of viable controls for disproportionality analysis. Therefore, future work is needed to identify the optimal cutoff.

We have arbitrarily decided to use four controls in disproportionality analysis because we are aware that the number of disproportionality signals generated in the framework will proportionally increase with the number of controls chosen (Supplementary Fig. 9). However, future work is needed to determine the optimal number of controls in our framework. In this regard, a challenge will be whether the optimal number would vary by drug under consideration, but evaluation of this issue and developing a systematic framework is extensive and challenging. It should be emphasized that increasing the number of controls also increases the number of disproportionality signals. However, this is also true because the same threshold is used to define a disproportionality signal irrespective of the number of tests. In other words, the problem of multiplicity is ignored. We hypothesized that the increase in the number of alerts largely depends on which control is being progressively added. Hence, the result is not general and further studies are necessary to clarify this result.

Finally, it is important to highlight that 340 out of 399 disproportionality signals were not dismissed, and that still represents an important number of signals to validate.

5 Conclusion

We developed a new framework that includes AI and non-AI approaches to ease some of the most time-consuming and labor-intensive steps of signal detection and validation, such as the selection of controls and the identification of co-reported drugs that may serve as alternative causes for the reported AEs listed in disproportionality signals. Future work is needed to further assess the extent of use of our framework for other drugs and spontaneous reporting databases when performing routine signal detection and validation activities.