Background

In clinical trials of PD-1 or PD-L1 checkpoint immunotherapies, patients with NSCLC separate into groups that respond or do not respond to immunotherapy treatment [1,2,3,4]. Objective responses for NSCLC in these studies range from 19.0–23.0%. Patients are selected for immunotherapy based on immunohistochemistry (IHC) detection of PD-L1 reactivity. Positive PD-L1 reactivity in tumors is considered important for predicting the success of PD-1 and PD-L1 immunotherapy treatments [2, 5]. However, IHC results for PD-L1 reactivity can vary with the IHC platform, the anti-PD-L1 antibody, the scoring system, and the positivity cut-off value [4, 6,7,8,9]. Similar antibody-specific differences were seen in the Blueprint PD-L1 IHC Assay Comparison Project [10]. Overall, this variability presents challenges for using PD-L1 reactivity as a sole marker for diagnosis and as a marker to predict the success of PD-1 and PD-L1 immunotherapy treatments.

More likely, there is a complex profile of molecules that contributes to the regulation of PD-L1 and to the subsequent immunosuppressive effects that NSCLC cells have on immune cells [11]. Chae et al. suggested that reliable predictive molecules need to be identified that can be used to select patients who would benefit from immunotherapy, yet limit the exposure of patients who would not benefit or have adverse reactions [12]. Multifaceted predictive biomarker systems have also been proposed that contain input on PD-L1 expression, tumor mutations, and the roles of inflammatory cells to identify patients that would respond or not respond to immunotherapy treatment [13, 14]. For better treatment outcomes, there is a recognized need to develop additional methods that can identify a profile of molecules that contributes to the regulation of PD-L1 expression and affirms IHC PD-L1 positive reactivity. One approach is to use the influence of patient cell genomics on tumor cell signaling to identify the downstream effects on PD-L1 expression.

In this study, we hypothesized that patient tumor cell genomics influences cell signaling and the expression of PD-L1, chemokines, and immunosuppressive molecules. We also hypothesized that these profiles can be used to predict patient clinical responses. Rizvi et al. assessed the mutational profiles that determined sensitivity to PD-1 blockade in patients with NSCLC treated with pembrolizumab [15], and we used the Rizvi et al. dataset to test our hypothesis. We first assessed the effect of patient genomics on the expression profile of 24 molecules: PD-L1, 9 chemokines, and 14 immunosuppressive molecules. Differences among patient-specific models reflected the input that their deleterious gene mutation profiles had on modeled signaling pathways and the expression of PD-L1, chemokines, and immunosuppressive molecules. Second, we used the expression profiles of these 24 molecules to sort patients into those that would or would not respond to PD-1 immunotherapy. The 9 chemokines were used to generate an index to predict dendritic cell infiltration. PD-L1 and the 14 immunosuppressive molecules were selected as tumor-derived molecules with a long list of reported immunosuppressive functions (Additional file 1: Table S1). Our results suggest that patient-specific chemokine and immunosuppressive molecule expression profiles can be used to accurately predict clinical responses and thus differentiate among patients who would or would not respond to PD-1 immunotherapy.

Methods

Patient clinical characteristics and mutation profiles

This was a retrospective study. Patient data, clinical characteristics, and exome sequencing information for each of 34 patients were obtained directly from Supplement Table 3 of the Rizvi et al. study [15]. To maintain anonymity, a random string generator was used to create a new random, 6-character uppercase alphanumeric string for each patient. This blinded both the identities of the patients in this study and their link to the prior published dataset we modeled.

All patients had stage IV NSCLC and were treated at Memorial Sloan Kettering Cancer Center (n = 29) or the University of California at Los Angeles (n = 5) on protocol NCT01295827. All patients had consented to Institutional Review Board-approved protocols permitting tissue collection and sequencing by the co-authors in this study (Naiyer A. Rizvi and Timothy A. Chan). All patients initiated therapy in 2012–2013 and were treated at 10 mg/kg every 2–3 weeks, except for five patients who were treated at 2 mg/kg every 3 weeks. The overall response rate and progression-free survival were reported to be similar across doses and schedules. PD-L1 expression on NSCLC tumor cells and immune cells by IHC was reported and scored semi-quantitatively: ≥50.0% membranous staining was considered strong, 1–49.0% was considered weak, and < 1.0% was considered negative [15].

Exomes from each NSCLC patient were examined using FannsDB [16], FATHMM [16], Mutation Assessor [17], Polyphen [18], PROVEAN [19], and SIFT [20]. Gene mutations deleterious to gene function were identified (Additional file 2: Table S2). For example, there were 1192 gene mutations listed for patient SA97V5 and 36 mutations were deleterious to gene function (Fig. 1).

Fig. 1

The schema for creating predictive computational simulation models to predict molecule responses and identify patients that would respond or not respond to PD-1 immunotherapy treatment, using patient SA97V5 as a model example. Exome information from patient SA97V5 (a) contained 1192 total mutations with 36 deleterious mutations. This profile (b) was converted from a mutational profile to a computational format and annotated into the computational workflow to convert (c) a nontransformed model in the cancer network into (d) a patient SA97V5-specific simulation model. The patient SA97V5-specific simulation model was used to predict PD-L1 expression (e.g., 67.0% with respect to control), a dendritic cell (DC) infiltration index (e.g., 23.9% with respect to control), and an immunosuppressive molecule expression profile (e.g., range − 1.9% to 56.5% with respect to controls) (e). Predicted expression responses were all used (f) to sort patients into groups that would respond or not respond to PD-1 immunotherapy treatment. SA97V5 was identified as a patient who would respond to PD-1 immunotherapy treatment. Numerous validation checks (g) occurred on the cancer network, the simulation model predictions, and the PD-1 match rates between the predicted responses and the patient clinical responses

Simulation models

A validated cancer network containing a database of proteins involved in cell signal transduction, metabolism, and epigenetics, obtained from manual review of new and published research (Additional file 3: Figure S1), was used to create patient NSCLC-specific predictive computational simulation models. This approach modeled protein-protein interactions at each step in a signaling pathway using ordinary differential equations (ODE) [21] to predict specific pathway output [22]. Pathway protein-protein interactions at each specific node were modeled as Michaelis-Menten equations that contained the reaction, enzyme, initial concentrations of protein intermediate reactants, and reaction parameters such as Ka, Km, kcat, and Vmax. ODE were solved at each step by the Radau method [23]. To demonstrate this modeling approach, an annexure section of the PD-L1 pathway is illustrated showing the step-by-step details of the protein-protein interactions at each node in the pathway, as an example of the modeling process that also occurred in all of the other pathways (Additional file 4: Supplementary Materials and Methods and Supplement Table 3 of the Rizvi et al study). The cancer network and the schema for creating these simulation models, predicting molecule responses, and identifying those patients who would or would not respond to PD-1 immunotherapy are shown in Fig. 1.
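As a rough illustration of a single Michaelis-Menten node, the sketch below integrates dS/dt = -Vmax*S/(Km + S) for one hypothetical enzyme-catalyzed step. It is not the authors' network or solver. The function names and all parameter values are invented for illustration, and the time stepping uses the backward (implicit) Euler rule, which is the one-stage member of the Radau IIA family, as a simplified stand-in for the higher-order Radau method cited in the text [23].

```python
import math


def michaelis_menten_step(s_old, h, vmax, km):
    """One backward Euler step for dS/dt = -vmax * S / (km + S).

    The implicit equation S_new = s_old - h * vmax * S_new / (km + S_new)
    rearranges to S_new^2 + b * S_new - km * s_old = 0 with
    b = km + h * vmax - s_old; the non-negative root is returned.
    """
    b = km + h * vmax - s_old
    disc = math.sqrt(b * b + 4.0 * km * s_old)
    if b > 0:
        # Algebraically equal form that avoids cancellation when S is small.
        return 2.0 * km * s_old / (b + disc)
    return 0.5 * (disc - b)


def simulate_node(s0, vmax, km, t_end, n_steps):
    """Return times, substrate, and product concentrations for one node."""
    h = t_end / n_steps
    times, substrate, product = [0.0], [s0], [0.0]
    for i in range(1, n_steps + 1):
        s_new = michaelis_menten_step(substrate[-1], h, vmax, km)
        times.append(i * h)
        substrate.append(s_new)
        product.append(s0 - s_new)  # mass balance: S + P stays at s0
    return times, substrate, product


# Hypothetical values (uM, uM/s, uM, s); none of them come from the study.
times, substrate, product = simulate_node(
    s0=10.0, vmax=1.0, km=2.0, t_end=50.0, n_steps=500
)
```

An implicit rule is used here because signaling networks of this kind are typically stiff; in a full network each node would contribute one such rate term to a coupled system rather than being solved in isolation.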

NSCLC models in the cancer network were created for each patient. At the initial step, models did not contain patient-specific deleterious gene mutation profiles and were simulated to reach a homeostatic steady state, which served as the control baseline for the molecules of interest. Then patient-specific deleterious gene mutation profiles were converted into a computational format and annotated into the NSCLC cancer network, simulated to induce the patient-specific cancer disease states, and used to predict the expression of PD-L1, chemokines, and immunosuppressive molecules. At the network level, mutations of oncogenes were represented as gain of function at the activity level and mutations of tumor suppressor genes were represented as a loss of function at the activity level unless explicit functionality of the mutation was known from published studies. Copy number variations such as amplifications and deletions were represented as over-expression or deletion of gene function at the expression level. The time required to achieve a patient-specific network varied depending upon the complexity of the patient-specific deleterious gene mutation profile.

The modeled output contained the expression profiles of 24 molecules (i.e., PD-L1, 9 chemokines, and 14 immunosuppressive molecules). PD-L1 expression was reported as percent change calculated as ((D/C)-1)*100, where C was the absolute value of the non-tumorigenic baseline control (μM) and D was the absolute value of PD-L1 obtained from the patient-specific cancer state network (μM) [24]. CCL2 [25], CCL3 [26], CCL4 [27], CCL5 [28], CCL11 [29], CCL20 [30], and CX3CL1 [31] expression were determined similarly. These chemokines are capable of trafficking dendritic cells into the tumor microenvironment. Each chemokine percent expression value was assigned a weight, and the weights were normalized to sum to 1. A dendritic cell infiltration index was then calculated as the sum, over the chemokines, of the predicted percent change multiplied by its weight (Additional file 6: Table S4). Finally, the expression of 14 immunosuppressive molecules thought to facilitate the ability of cancer cells to escape normal tumor surveillance was determined (Additional file 1: Table S1).
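The two calculations described above can be sketched as follows. The function names are hypothetical, and the weights are an assumption: the study's weights are given in Additional file 6: Table S4 and are not reproduced here, so equal weights are used as a placeholder. The chemokine values are those reported in the Results for patient SA97V5.

```python
def percent_change(disease_um, control_um):
    """Percent change relative to the baseline control: ((D / C) - 1) * 100."""
    return (disease_um / control_um - 1.0) * 100.0


def dc_infiltration_index(pct_changes, weights):
    """Weighted sum of chemokine percent changes.

    The weights are rescaled so that they sum to 1 before being applied.
    """
    total = sum(weights[name] for name in pct_changes)
    return sum(pct_changes[name] * weights[name] / total for name in pct_changes)


# Predicted chemokine percent changes reported for patient SA97V5.
sa97v5 = {
    "CCL2": 28.7, "CCL3": 14.3, "CCL4": 27.2, "CCL5": 13.4, "CCL7": 38.0,
    "CCL11": 36.9, "CCL20": 30.6, "CX3CL1": 32.9, "CXCL14": -3.3,
}

# ASSUMPTION: equal weights stand in for the study's weights (Table S4).
equal_weights = {name: 1.0 for name in sa97v5}
index = dc_infiltration_index(sa97v5, equal_weights)
```

With equal weights the index for SA97V5 comes out at about 24.3, close to but not the same as the reported 23.9%, which suggests that the weights used in the study were not uniform.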

Patient-specific simulation model predictions were also assessed using Weka 3, a data mining software program in Java [32]. Weka 3 contained machine learning algorithms for data pre-processing; data classification, regression, clustering, and association rules; and data visualization. Using the predicted responses in Table 1, several machine-learning algorithms were implemented to learn prediction models (Additional file 7: Table S5).

Table 1 Discovery (n = 13) and Validation (n = 16) datasets of patients with non-small cell lung cancer containing PD-1 clinical response and patient-specific simulation models containing PD-1 predicted response, predicted PD-L1 expression, immunosuppressive biomarker expression, and predicted dendritic cell (DC) infiltration index

Clinical response projections

Differences among the expression of the 24 molecules were used in a 3-step process to sort patients into those that would or would not respond to PD-1 immunotherapy (Fig. 2). Patients were sorted by their PD-L1 expression (Step 1), their dendritic cell infiltration index (Steps 2a and b), and their immunosuppressive molecule expression (Steps 3a and b).

Fig. 2

A decision tree was used to identify PD-1 drug responder status. At step 1, 9 patients with PD-L1 expression below 29.0% were identified as PD-1 drug non-responders. The remaining 16 patients (including patient SA97V5) with PD-L1 expression equal to or greater than 29.0% proceeded to step 2. At Steps 2a and 2b, 2 patients with dendritic cell infiltration index values below 20.0% were identified as non-responders and 2 patients with index values greater than 60.0% were identified as PD-1 drug responders. Twelve patients with index values greater than 20.0% (including patient SA97V5), but less than 60.0% proceeded to step 3. At Step 3, 4 patients with immunosuppressive molecule (ISM) values higher than that of their PD-L1 expression with a margin of greater than 5.0%, were identified as non-responders (Step 3a) and 8 patient-specific models with values lower than that of their PD-L1 expression with a margin of greater than 5.0% were identified as responders (Step 3b, including patient SA97V5). Mismatch patients GI7AGZ, 2FCOH7, F3FK2W were not listed
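The decision tree in Fig. 2 can be written as a short function. This is a sketch under stated assumptions, not the authors' implementation: the function name and return format are hypothetical, values exactly at 20.0% or 60.0% are sent on to step 3, and at step 3 a single immunosuppressive molecule exceeding PD-L1 by more than the 5.0% margin is taken to be enough for a non-responder call, with everything else called a responder.

```python
def predict_pd1_response(pdl1_pct, dc_index, ism_pcts,
                         pdl1_cutoff=29.0, dc_low=20.0, dc_high=60.0,
                         margin=5.0):
    """Return (predicted status, deciding step) for one patient model.

    pdl1_pct  predicted PD-L1 expression, % change vs. control
    dc_index  predicted dendritic cell infiltration index, %
    ism_pcts  predicted % change of each immunosuppressive molecule
    """
    # Step 1: low predicted PD-L1 expression.
    if pdl1_pct < pdl1_cutoff:
        return "non-responder", "1"
    # Steps 2a and 2b: dendritic cell infiltration index.
    if dc_index < dc_low:
        return "non-responder", "2a"
    if dc_index > dc_high:
        return "responder", "2b"
    # Step 3. ASSUMPTION: one molecule above PD-L1 + margin is sufficient.
    if any(value > pdl1_pct + margin for value in ism_pcts):
        return "non-responder", "3a"
    return "responder", "3b"


# Patient SA97V5: PD-L1 67.0%, DC index 23.9%, ISM range -1.9% to 56.5%.
sa97v5_call = predict_pd1_response(67.0, 23.9, [-1.9, 56.5])
```

For SA97V5 only the reported range of the 14 immunosuppressive molecule values is used, which is sufficient here because the largest value (56.5%) is below the PD-L1 value.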

Signal pathways

Graphic representations of the simulation model networks for each patient-specific model and the underlying network relationships were created as previously described [24] to identify similarities in patient-specific signaling pathways and to identify the influence of the pathway intermediates altered by the patient deleterious gene mutation profiles.

Simulation model validations

A series of internal control check analyses were used to validate the cancer network input and output data. These control checks monitored a) the effects of select pathway molecule over-expression or knockdown on pathway predictions, b) the effects of select drugs on pathway predictions, and c) the effects of activation, regulation, and cross-talk interactions among pathway intermediates on pathway predictions.

A cross-validation approach was used to assess the match scores of the PD-1 predicted responses against the PD-1 clinical responses in the Rizvi et al. 2015 Discovery dataset vs. the Validation dataset [15]. The datasets were then pooled and re-partitioned into two new Training and Test datasets. A similar cross-validation approach was then used to assess the match scores of the PD-1 predicted responses vs. the PD-1 clinical responses. Differences between the match rates of the PD-1 predicted responses and the PD-1 clinical responses were assessed via the chi-square test or Fisher’s exact test as previously described [24]. All statistical tests utilized a 0.05 level of significance.

Results

Simulation models

In this retrospective study, we created separate patient-specific simulation models for 29 of the 34 patients listed in Supplement Table 3 of the Rizvi et al. study [15], using the exome sequencing information for each patient. In the Discovery dataset, 13 of 16 patients had sufficient information to create simulation models and patients RYRJFL, IYXPLI, and GOFKQI did not (Additional file 2: Table S2). In the Validation dataset, 16 of 18 patients had sufficient information to create computational simulation models and patients 6NLFT5 and 32I5VC did not (Additional file 2: Table S2). The 5 patients with insufficient information were omitted from this study since their deleterious gene mutation profiles lacked driver genes and we were unable to achieve an increase in tumor phenotypes of proliferation and viability with the subset of gene aberrations reported. The remaining patients in the Discovery dataset (n = 13) contained 5 clinical responders and 8 clinical non-responders and the patients in the Validation dataset (n = 16) contained 6 clinical responders and 10 clinical non-responders (Table 1). Objective responses to PD-1/PD-L1 immunotherapies are known to vary. For example, objective responses to PD-1 immunotherapies for NSCLC were reported to range from 19.0–21.0% and objective responses to PD-L1 immunotherapies for NSCLC were reported to range from 10.0–23.0% [1, 2]. Hence, the larger proportion of non-responders than responders in this sample was consistent with the responses previously reported in larger patient populations.

Patient molecule responses

Results reported for patient SA97V5 were used as an example to clearly illustrate the process of model creation, model prediction, and model validation.

Modeled PD-L1 expression ranged from − 8.3% (patient 67K46M) to 185.5% (patient M9GYO4) (Table 1). Patient SA97V5 had a PD-L1 expression value of 67.0%.

Modeled chemokine expression was used to create a dendritic cell infiltration index. This index was a weighted function of the percentage change of each of the 9 individual chemokines (Table 1) and ranged from 4.20 (patient 67K46M) to 79.85 (patient DFZLO2). Patient SA97V5 chemokine expression for CCL2 (28.7%), CCL3 (14.3%), CCL4 (27.2%), CCL5 (13.4%), CCL7 (38.0%), CCL11 (36.9%), CCL20 (30.6%), CX3CL1 (32.9%), and CXCL14 (− 3.3%); formula details; and calculations for creating the index value of 23.9% are shown in Additional file 6: Table S4.

Modeled expression profiles for 14 immunosuppressive molecules, including those for patient SA97V5, are also shown in Table 1.

Clinical response projections

The expression of PD-L1, dendritic cell infiltration index, and immunosuppressive molecules were used in a 3-step process to sort patients into those that would or would not respond to PD-1 immunotherapy (Fig. 2).

At step 1, 9 patients with PD-L1 expression below 29.0% were identified as PD-1 drug non-responders (Figs. 2 and 3a). The remaining 16 patients with PD-L1 expression equal to or greater than 29.0% proceeded to step 2. Patient SA97V5 had a PD-L1 expression value of 67.0% and proceeded to step 2.

Fig. 3

Patient-specific simulation models were used to predict the expression of PD-L1 (a) and at step 1 of the decision tree, 9 patients (black bars) with predicted PD-L1 expression below 29.0% (bold line) were identified as PD-1 drug non-responders. Patient SA97V5 had a predicted PD-L1 expression of 67.0%. Patient-specific simulation models were used to predict the expression of chemokines used to create a dendritic cell (DC) infiltration index (b). At step 2a of the decision tree, 2 patients (black bars) with index values greater than 60.0% (bold line) were identified as PD-1 drug responders and 2 patients (black bars) with index values less than 20.0% (black line) were identified as PD-1 drug non-responders. Patient SA97V5 had a predicted DC infiltration index of 23.9%

At Step 2, 2 patients with dendritic cell infiltration index values below 20.0% were identified as non-responders (Figs. 2 and 3b) and 2 patients with index values greater than 60.0% were identified as PD-1 drug responders (Figs. 2 and 3b). One mismatch occurred at this step. Clinical non-responder GI7AGZ with a dendritic cell index of 64.9% was misidentified as a PD-1 responder (Fig. 3b). Twelve patients with index values greater than 20.0%, but less than 60.0% proceeded to step 3. Patient SA97V5 had a dendritic cell infiltration index of 23.9% and proceeded to step 3.

At Step 3, 4 patients with immunosuppressive molecule values higher than that of their PD-L1 expression with a margin of greater than 5.0% were considered to be non-responders (Fig. 2, Step 3a) and 8 patient-specific models with values lower than that of their PD-L1 expression with a margin of greater than 5.0% were considered to be responders (Fig. 2, Step 3b). Three mismatches occurred at this step. Clinical responder patient 2FCOH7 had an immunosuppressive molecule expression profile of a non-responder: vascular endothelial growth factor (VEGF), cytotoxic T-lymphocyte-associated protein 4 (CTLA4), ganglioside GM3 (GM3), and ganglioside GD2 (GD2) were all higher than that of PD-L1 with a margin of greater than 5.0%. Clinical non-responder patients F3FK2W and 6QFSVV had immunosuppressive molecule expression profiles of responders. Patient SA97V5 had all 14 immunosuppressive molecules below the threshold of PD-L1 and was identified as a PD-1 drug responder (Figs. 2 and 4a) and patient QIA43T had molecules TGFB1 and IL6 above the threshold of PD-L1 and was identified as a PD-1 drug non-responder (Figs. 2 and 4b).

Fig. 4

Patient-specific simulation models were used to predict the expression of 14 immunosuppressive molecules. At step 3 of the decision tree, patients with immunosuppressive molecule predictions higher than that of PD-L1 with a margin of greater than 5.0% (bold line), were considered to be non-responders and patient-specific models with predictions lower than that of PD-L1 with a margin of greater than 5.0% were considered to be responders. Eight remaining patients were identified as responders and 4 remaining patients were identified as non-responders. Patient SA97V5 (a) had all 14 molecules below the threshold and was identified as a PD-1 drug responder. Patient QIA43T (b) had 2 molecules above the threshold and was identified as a PD-1 drug non-responder

Patient-specific model predictions in the Discovery and Validation datasets were also checked using Weka 3 [32] and SMO support vector machine with a normalized polynomial kernel had the best performance (Additional file 7: Table S5). The relationship between PD-L1 expression and predicted TGFB1 expression using Weka 3 algorithms for all patients in the dataset is shown in Additional file 8: Figure S2 and similar trends were seen when comparing the PD-L1 expression level to the other 13 predicted molecules. Weka 3 correctly identified 24 out of 29 patients whereas the computational simulation models correctly identified 25 of 29 patients.

Model validation

In the cross-validation analysis of the match scores between the Rizvi et al. 2015 Discovery and Validation datasets, there were no significant differences between the match scores of non-responders and responders in the PD-1 clinical response group (38.5% vs. 37.5%; p = 0.9577, Additional file 9: Table S6) and PD-1 predicted response group (30.8% vs. 56.3%; p = 0.2642, Additional file 9: Table S6). Even though the Discovery dataset had a higher match score rate among the PD-1 clinical response group and the PD-1 predicted response group than the Validation dataset (92.3% vs. 81.2%, respectively), there was no significant difference between the two datasets (p = 0.6059).
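The Discovery vs. Validation comparisons can be checked with a few lines of standard-library code. This is a sketch under stated assumptions: the function names are hypothetical, the responder counts (5 of 13 and 6 of 16) come from Table 1 as summarized above, and the match counts (12 of 13 and 13 of 16) are inferred from the reported 92.3% and 81.2% match rates. The chi-square test is computed without a continuity correction, which is an assumption about how the reported value was obtained.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2.0))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    low, high = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_obs * (1.0 + 1e-9))


# Clinical responders vs. non-responders: Discovery (5, 8), Validation (6, 10).
chi2_stat, chi2_p = chi_square_2x2(5, 8, 6, 10)

# Matches vs. mismatches (inferred counts): Discovery (12, 1), Validation (13, 3).
fisher_p = fisher_exact_2x2(12, 1, 13, 3)
```

Under these assumptions the chi-square p-value is about 0.958 and the Fisher's exact p-value is about 0.606, in line with the reported p = 0.9577 and p = 0.6059.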

Similarly, in the cross-validation analysis of the match scores between the Training and Test datasets, there were no significant differences between the match scores of non-responders and responders in the PD-1 clinical response group (38.9% vs. 36.4%; p = 0.9999, Additional file 9: Table S6) and PD-1 predicted response group (44.4% vs. 45.5%; p = 0.9577, Additional file 9: Table S6). In the Training dataset, the PD-1 predicted responses had an 83.3% match score with the PD-1 clinical responses and in the Test dataset the PD-1 predicted responses had a 90.9% match score with the PD-1 clinical responses. Again, there was no significant difference between the two datasets (p = 0.9999).

Predicted pathway comparisons

Deleterious gene mutations in patients were mapped to unique and common signaling pathways involved in PD-L1 expression (Fig. 5). Common pathways were utilized among a number of patient-specific models. Mutations in patient C9TGAJ (Kirsten rat sarcoma viral oncogene homolog, KRAS mutation), patient RDD2UW (KRAS mutation), patient M9GYO4 (mitogen-activated protein kinase kinase 2, MAP2K2 mutation), and patient DFZLO2 (mitogen-activated protein kinase kinase kinase 1, MAP3K1 mutation) altered the extracellular signal-regulated kinase (ERK) activation pathway. Mutations in patient P90A0O (B-Raf proto-oncogene 1, BRAF1 mutation and tumor protein p53, TP53 mutation) and patient L8MTGU (KRAS and TP53 mutations) altered ERK activation and apoptotic pathways. Patient SA97V5 (breast cancer anti-estrogen resistance protein 1, BCAR1 mutation; ankyrin-2, ANK2 mutation; insulin receptor substrate 1, IRS1 mutation; and cAMP response element-binding protein (CREB)-binding protein, CREBBP mutation), patient 26YMUF (rapamycin-insensitive companion of TOR, RICTOR mutation and ERK mutation), and patient L6ADEL (TP53 mutation and TNF receptor-associated factor 3, TRAF3 mutation) all had non-KRAS and non-B-Raf proto-oncogene (BRAF) driven activation pathways of PD-1 drug responder status.

Fig. 5

The expression of PD-L1 was influenced via a number of signaling pathways. Activating signals were processed via the ERK signaling pathway (via EGFR; B-Raf proto-oncogene, serine/threonine kinase, BRAF-V600E; mitogen-activated protein kinase kinase 1/2, MEK1/2; mitogen-activated protein kinase kinase 1, MAP2K1; MAP2K2; ERK1/2; mitogen-activated protein kinase 3, MAPK3; mitogen-activated protein kinase 1, MAPK1; and Jun proto-oncogene, c-Jun). Activating signals were processed via the EGFR signaling pathway (via neuroblastoma RAS viral oncogene homolog, NRAS; phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha, PIK3CA; V-akt murine thymoma viral oncogene homolog, AKT; mechanistic target of rapamycin, MTOR; and STAT3). Also, activating signals were processed via the interferon gamma (IFNG) pathway (via IFNG; interferon gamma receptor 1, IFNGR1; signal transducer and activator of transcription 1, STAT1; and interferon regulatory factor 1, IRF1). Pathway signals converge to activation factors Activator protein 1 (AP1), STAT1, STAT3, and IRF1 leading to transcription of PD-L1 genes. Common pathways were utilized among a number of patient-specific simulation models. Patient C9TGAJ (KRAS mutation), patient RDD2UW (KRAS mutation), patient M9GYO4 (MAP2K2 mutation), and patient DFZLO2 (MAP3K1 mutation) involved the ERK activation pathway. Patient P90A0O (BRAF1, TP53 mutations) and patient L8MTGU (KRAS, TP53 mutations) involved the ERK activation and apoptotic pathways

Models also reinforced the association between PD-L1 expression and the presence of a KRAS mutation (or high ERK activation). Thus, if a patient responder was found to have a KRAS mutation or a positive regulator around the mitogen-activated protein kinase kinase (MEK) pathway, this may identify a means to regulate PD-L1 by the MEK-mediated pathway. A KRAS/BRAF/MEK-related mutation in the profile leads to stronger expression of PD-L1 in the profile. Patients 195P5D, J0T9TJ, and 6QFSVV had KRAS mutations but were non-responders to the PD-1 inhibitor, indicating that complex and additional factors and pathways drive PD-L1 expression and response to the checkpoint inhibitor.

KRAS mutations and PD-L1 expression

To further affirm the association between the presence of KRAS mutations or KRAS co-mutations and PD-L1 expression, 2 additional datasets [33, 34] were modelled beyond the Supplement Table 3 of the Rizvi et al study [15].

KRAS mutations in lung adenocarcinoma were reported to be associated with co-mutations in TP53. In modeled simulations of the Dong, et al. dataset [33], the KRAS+TP53 co-mutation (KP subgroup) was predicted to increase PD-L1 expression. The KRAS+TP53 co-mutation had higher levels of predicted PD-L1 expression than the KRAS mutation and TP53 mutation alone.

KRAS mutations in lung adenocarcinoma were also reported to be associated with co-mutations in STK11/LKB1 (the KL subgroup) [34]. In modeled simulations of the Skoulidis, et al. dataset [34], KRAS+STK11+KEAP1 co-mutation was predicted to reduce PD-L1 expression. The KRAS+STK11+KEAP1 co-mutation had lower levels of predicted PD-L1 expression than the KRAS+TP53 co-mutation.

KRAS mutations in lung adenocarcinoma were reported to be associated with co-mutations in TP53 (KP subgroup) and CDKN2A/B [34]. In modeled simulations of the Skoulidis, et al. dataset [34], KRAS+CDKN2A/B co-mutations (KC subgroup) were predicted to reduce PD-L1 expression. KRAS+CDKN2A/B co-mutation had lower levels of predicted PD-L1 expression than the KRAS+STK11+KEAP1 co-mutation and the KRAS+TP53 co-mutation.

Discussion

In this retrospective study, we used a recent dataset from NSCLC patients treated with pembrolizumab and identified deleterious gene mutational profiles in patient exomes. We annotated the deleterious gene mutational profiles into a cancer network to create NSCLC patient-specific predictive computational simulation models. We used these models as a tool to identify and validate a profile of 24 chemokines and immunosuppressive molecules that could accurately affirm expression of PD-L1 and predict patient clinical responses to PD-1 immunotherapy. We found that patient tumor cell genomics influenced cell signaling and altered the expression of PD-L1, 9 chemokines, and 14 immunosuppressive molecules. We also found that expression profiles of these 24 chemokines and immunosuppressive molecules could be used to identify patients who would or would not respond to PD-1 immunotherapy. Adding chemokine and immunosuppressive molecule expression profiles to a predicted PD-L1 profile allowed models to achieve a greater than 85.0% predictive correlation among predicted and reported patient clinical responses. This differentiated patients who would and would not benefit from PD-1 or PD-L1 immunotherapies. To validate our results, we used retrospective correlation of our simulation models against patient genomic signatures and clinical outcome data that were available in the NSCLC cohort of the Rizvi et al. study [15]. We also used Weka 3 to validate predictions determined using the predictive computational simulation models. The Weka 3 machine-learning results were similar to those generated via the simulation models, and the chi-square test showed no significant differences among the match rate results in these datasets. It is important to note that expanding PD-L1 expression profiles to include 23 additional chemokine and immunosuppressive molecule expression responses allowed models to achieve a greater than 85.0% correlation among predicted and reported patient clinical responses.

The 24 molecules used in this study have immunosuppressive properties. The role of PD-L1 in tumor pathogenesis is well known. Increased expression of PD-L1 on tumor cells inhibits T-cell proliferation, reduces T-cell survival, inhibits cytokine release, and promotes T-cell apoptosis [3, 35,36,37,38]. This leads to T-cell exhaustion and adaptive tumor immunosuppression [39].

Cytokines also have a role. Cytokine scores were recently found to be associated with overall survival in CheckMate 017 and 057 (both for nivolumab and docetaxel treated patients) [40]. In the present study, 9 chemokines were selected that chemoattract dendritic cells [41,42,43]. In other studies, dendritic cells were present in NSCLC tumors [42, 44] and dendritic cell infiltration was reported to be an independent prognostic factor for NSCLC [44]. The simulation models here captured the role of dendritic cells by predicting chemokine expression in the form of a functional dendritic cell index (Additional file 6: Table S4).

Fourteen molecules were included that have pleiotropic functions including immunosuppressive properties (Additional file 1: Table S1). IL6 can prevent dendritic cell maturation, prime tumor-specific T-cells via signal transducer and activator of transcription 3 (STAT3) signaling, inhibit NF-κB binding activity, and inhibit C-C chemokine receptor type 7 (CCR7) expression [45,46,47]. IL10 can impair dendritic cell function and protect tumor cells from cytotoxic T-cell-mediated cytotoxicity by downregulating transporter-associated with antigen processing (TAP)1 and TAP2 [48, 49]. TGFβ can alter immune surveillance of regulatory T-cells. It represses CTL-mediated tumor cytotoxicity by altering the expression of perforin, granzyme A, granzyme B, Fas ligand (FASL), and IFNγ [49,50,51]. VEGF is a marker of tumor invasion and metastasis and can inhibit maturation of dendritic cells [44, 52]. IDO is a tryptophan-metabolizing enzyme that limits tryptophan, inhibits the proliferation of lymphocytes, and contributes to peripheral immunologic tolerance [53,54,55,56]. Increased IDO production by cancer cells downregulates natural killer (NK) receptors and induces NK cell apoptosis. It induces cell cycle arrest, decreases activation, and increases apoptosis in cytotoxic T-cells. Tryptophan 2,3-dioxygenase 2 (TDO2) inhibits tryptophan 2,3-dioxygenase [57]. Prostaglandin E2 (PGE2) suppresses NK cell function through the E2 prostaglandin receptor 4 (EP4) [58]. Lectin, galactoside-binding, soluble, 9 (LGALS9) mediates T-cell dysfunction and T-cell senescence [59]. Cluster of Differentiation 47 (CD47) is a negative regulator of dendritic cells, binding to signal regulatory protein (SIRP) on dendritic cells and directly repressing dendritic cell phagocytosis, maturation, and production of IFNγ [47]. CTLA4 restrains the adaptive immune response of T-cells towards tumor-associated antigens [60,61,62]. Gangliosides GM3 and GD2 induce monocyte apoptosis and impair differentiation to dendritic cells [63].

Using a 24-molecule expression profile allowed the computational models to achieve a greater than 85.0% predictive correlation. However, this list was not exhaustive, and incorporating additional molecules into patient NSCLC-specific expression profiles may have merit and improve computational model accuracy. Candidates include the co-inhibitory receptors lymphocyte activation gene-3 (LAG-3), T cell immunoglobulin-3 (TIM-3), and T cell immunoglobulin and ITIM domain (TIGIT) [64]. LAG-3 is a co-inhibitory receptor upregulated on activated CD4+ T cells, CD8+ T cells, and subsets of natural killer (NK) cells [64,65,66]. It impairs T cell proliferation and cytokine production and alters NK cell cytotoxicity and cytokine production. TIM-3 is a cell surface molecule expressed on IFNγ-producing CD4+ T helper 1 cells, CD8+ T cytotoxic 1 T cells, NK cells, monocytes, and dendritic cells [64]. TIM-3 dampens the development of protective immunity and TIM-3 blockade improves cell function. In patients with NSCLC, co-blockade of the TIM-3 and PD-1 pathways suppresses tumor growth. TIGIT is another co-inhibitory receptor expressed on NK cells, T cells, and Treg cells [64]. TIGIT binds the ligands CD155 and CD112 and suppresses immune responses through CD155 on dendritic cells. TIGIT is thought to work with PD-1 and TIM-3 to attenuate T cell responses and promote T cell dysfunction.

After the NSCLC patient-specific predictive computational simulation models were created and the profiles of 24 chemokines and immunosuppressive molecules were predicted, we created a decision tree to identify patients who would or would not respond to PD-1 immunotherapy. Decision cutoffs were established at 29.0% PD-L1 expression (Step 1), < 20.0% dendritic cell infiltration (Step 2a), > 60.0% dendritic cell infiltration (Step 2b), and immunosuppressive molecule expression lower than PD-L1 expression by a margin of greater than 5.0% (Step 3) (Fig. 2). The decision tree was robust and had built-in redundancy. Basing the PD-L1 drug responder status on 3 separate predicted criteria allowed a responder or non-responder not identified at one step to be identified at a later step. The thresholds were also specific. At a PD-L1 expression cutoff of 29.0% (Step 1), 9 non-responder patients were identified. Decreasing the cutoff from 29.0% to 25.0% identified only 6 non-responders. Increasing the cutoff from 29.0% to 35.0% introduced false negatives: up to 13 patients were identified as non-responders, and the three additional patient responders L8MTGU, P90A0O, and 26YMUF would be falsely identified as non-responders.
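The stepwise logic of such a decision tree can be illustrated with a minimal Python sketch. This is not the implementation used in the study. The cutoffs are those quoted above, but the outcome assigned at each branch (for example, that low dendritic cell infiltration indicates a non-responder), the names `PredictedProfile` and `classify`, and the example values are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Cutoffs quoted in the text (all in percent).
PDL1_CUTOFF = 29.0          # Step 1
DC_LOW_CUTOFF = 20.0        # Step 2a
DC_HIGH_CUTOFF = 60.0       # Step 2b
SUPPRESSIVE_MARGIN = 5.0    # Step 3


@dataclass
class PredictedProfile:
    """Hypothetical container for one patient's predicted indices (percent)."""
    pdl1: float         # predicted PD-L1 expression
    dc_index: float     # predicted dendritic cell infiltration index
    suppressive: float  # predicted immunosuppressive molecule expression


def classify(profile, pdl1_cutoff=PDL1_CUTOFF):
    """Return (status, deciding step). Branch outcomes are ASSUMED, not from the paper."""
    # Step 1: PD-L1 expression below the cutoff is treated as a non-responder.
    if profile.pdl1 < pdl1_cutoff:
        return "non-responder", "step 1"
    # Step 2a (assumed direction): low dendritic cell infiltration -> non-responder.
    if profile.dc_index < DC_LOW_CUTOFF:
        return "non-responder", "step 2a"
    # Step 2b (assumed direction): high dendritic cell infiltration -> responder.
    if profile.dc_index > DC_HIGH_CUTOFF:
        return "responder", "step 2b"
    # Step 3 (assumed direction): immunosuppressive expression below PD-L1
    # by more than the margin -> responder; otherwise non-responder.
    if profile.pdl1 - profile.suppressive > SUPPRESSIVE_MARGIN:
        return "responder", "step 3"
    return "non-responder", "step 3"


# Hypothetical patient whose PD-L1 value lies between the 29.0% and 35.0% cutoffs.
example = PredictedProfile(pdl1=32.0, dc_index=40.0, suppressive=20.0)
print(classify(example))                    # decided at Step 3 with the 29.0% cutoff
print(classify(example, pdl1_cutoff=35.0))  # decided at Step 1 with the 35.0% cutoff
```

Under these assumptions, raising the Step 1 cutoff to 35.0% removes a patient at the first step who would otherwise have reached Step 3, which is the kind of false negative described above.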

A diversity of signaling pathways are reported to be involved in the expression and regulation of PD-L1 [67,68,69], and these pathways were observed in the expression and regulation of PD-L1 in this study (Fig. 5). Responder patients had mutations around the rapidly accelerated fibrosarcoma (RAF)-rat sarcoma (RAS)-ERK pathway, including KRAS/BRAF and MEK-related mutations, that predicted the profiles to have stronger expression of PD-L1. However, the presence of a KRAS mutation cannot be the only criterion for predicting strong expression of PD-L1, and thus a likely PD-1 drug responder, since there were non-responder profiles that also had KRAS mutations. We observed that matched predicted and clinical non-responder patients 195P5D and J0T9TJ and mismatched predicted responder and clinical non-responder patient 6QFSVV all had KRAS mutations (Additional file 2: Table S2, Fig. 5).

Recent studies support the concept that NSCLC is not a homogeneous disease and that at least 3 subtypes of KRAS mutations involving LKB1 or TP53 can be identified. The tumors with these mutations have different PD-L1 expression patterns (higher in tumors with KRAS and TP53 mutations) and different sensitivities to immune checkpoint blockade. Thus, the effects of KRAS mutations and KRAS co-mutations on PD-L1 expression were further assessed using 2 additional datasets [33, 34] beyond Supplement Table 3 of the Rizvi et al. study [15].

Dong et al. [33] reported that TP53 and KRAS mutations may predict which patients would or would not respond to PD-1 immunotherapy. Modeling the dataset in their study, we predicted that KRAS+TP53 co-mutation (KP subgroup) would lead to increased PD-L1 expression. Skoulidis et al. [34] reported that KRAS mutations in lung adenocarcinoma were associated with co-mutations in STK11/LKB1 (the KL subgroup). KL tumors had high rates of KEAP1 mutations with lower PD-L1 expression. Modeling the dataset in their study, we predicted that KRAS+STK11+KEAP1 co-mutations (KL subgroup) would lead to reduced PD-L1 expression. We also predicted that KRAS+CDKN2A/B co-mutation (KC subgroup) would lead to reduced PD-L1 expression. There was a reduction in positive regulation due to a reduction in the AMPK and mTOR pathways and also due to KEAP1 loss of function. There was an increase in the wild-type (WT) TP53-mediated inhibitory regulation of PD-L1 expression. This was a novel finding based on network analysis.
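The subgroup predictions above can be summarized as a simple lookup. The Python sketch below is illustrative only: the function name `kras_subgroup`, the use of gene symbols as input, and the precedence applied when a tumor carries more than one co-mutation (KL, then KP, then KC) are assumptions and are not taken from the study. Only the predicted direction of PD-L1 expression for each subgroup comes from the text.

```python
# Predicted direction of PD-L1 expression per subgroup, as stated in the text.
PREDICTED_PDL1 = {"KP": "increased", "KL": "reduced", "KC": "reduced"}


def kras_subgroup(mutated_genes):
    """Return (subgroup, predicted PD-L1 direction), or None if no subgroup applies.

    The precedence KL > KP > KC for tumors with several co-mutations is an
    ASSUMPTION made for this sketch.
    """
    genes = {g.upper() for g in mutated_genes}
    if "KRAS" not in genes:
        return None
    if genes & {"STK11", "LKB1"}:        # KRAS + STK11/LKB1
        label = "KL"
    elif "TP53" in genes:                # KRAS + TP53
        label = "KP"
    elif genes & {"CDKN2A", "CDKN2B"}:   # KRAS + CDKN2A/B
        label = "KC"
    else:
        return None
    return label, PREDICTED_PDL1[label]


print(kras_subgroup(["KRAS", "TP53"]))
print(kras_subgroup(["KRAS", "STK11", "KEAP1"]))
```

A lookup of this kind only labels a mutation list. It does not reproduce the network simulation from which the predicted directions were derived.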

Furthermore, PTEN deletion [70], PI3K mutations [71] and MYC overexpression [72] have also been recently characterized as oncogenic mechanisms leading to PD-L1 expression.

The techniques described in this retrospective study have potential clinical application. Although the techniques are complicated and need more extensive validation with larger datasets, their use in clinical practice is possible. Tumor profiling is becoming more mainstream in precision personalized medicine. The approach is not necessarily expensive, and it adds utility to the profiling data already generated for most tumor samples.

Conclusions

Patient tumor cell genomics were found to influence cell signaling with downstream effects on the expression of 24 chemokines and immunosuppressive molecules. This allowed us to establish patient-specific profiles of these molecules that could be used to predict patient clinical responses, with greater than 85.0% correlation between predicted and reported patient clinical responses. A workflow incorporating immunosuppressive molecules could serve as a) a potential complementary assay to affirm IHC results or an alternate assay where IHC is infeasible, b) a means to affirm patient PD-1 and PD-L1 drug responder status, c) a method to determine influencing factors on PD-L1 expression, and d) a potential clinical decision support system facilitating selection of therapies based on individual patient mutational profiles. The latter application, used shortly after cancer diagnosis and just before cancer treatment, could generate important patient-specific treatment options that could assist clinicians in selecting appropriate mono-therapies or combination therapies.