Introduction

B-cell chronic lymphocytic leukemia (CLL) is a clonal expansion of an antigen-experienced B cell (1,2) expressing CD5, CD19, CD23 and low levels of surface membrane immunoglobulin (smIg). Although CLL is the most common leukemia in the western hemisphere, its biology and pathogenesis are still poorly understood. Furthermore, CLL is a remarkably heterogeneous disorder, with some patients surviving for decades, often without therapy and eventually dying of unrelated causes, and others having a rapidly evolving, fatal disease despite aggressive therapy (3,4).

Several studies of the smIgs of CLL cells showed that the immunoglobulin heavy variable (IGHV) and immunoglobulin kappa/lambda variable (IGK/LV) genes had undergone somatic hypermutation (5–8) and that this process could be ongoing during disease evolution, thereby generating intraclonal IGHV and IGK/LV diversification (9–11). Furthermore, patients whose leukemic clones exhibit mutated IGHVs usually have less aggressive disease and live longer than patients with unmutated IGHVs (12,13). However, none of these studies addressed the molecular features of paired IGHV and IGK/LV segments of individual CLL clones.

Here we analyzed the mutation pattern of paired IGHV-diversity-joining (IGHV-D-J) and IGK/LV-joining (IGK/LV-J) rearrangements of leukemic cells from 193 CLL patients. Our data indicate that the mutation patterns of κ and λ CLL clones differ from each other. Moreover, by comparing CLL cells to normal B cells, which have been so far studied with respect to the κ isotype only (14), we identified a different somatic hypermutation pattern.

Materials and Methods

Patients, Leukemic Cells, Immunoglobulin Variable (IGV) Sequences and Analyses

After informed consent was obtained, heparinized venous blood was drawn from patients with CLL, and peripheral blood mononuclear cells were isolated. The diagnosis of CLL was based on accepted clinical and immunophenotypic features (15). Rearranged IGHV-D-J and IGKV-J or IGLV-J paired segments were sequenced from cDNA of 218 CLL patients as described (5,6); in addition, 148 immunoglobulin (Ig) sequences (IGHV + IGK/LV) were retrieved from GenBank, bringing the total analyzed to 366. For both our sequences and those retrieved from GenBank, only samples with allelic exclusion of both IGHV and IGK/LV were included in the study. Sequences were analyzed using the V-QUEST tool at the international ImmunoGeneTics (IMGT) Information System® (http://imgt.cines.fr) (initiator and coordinator: Marie-Paule Lefranc, Montpellier, France [16]). The 3′ end of IGK/LVs (coding for the light complementarity determining region 3 [LCDR3]) was inspected visually to exclude mutations identified by V-QUEST that were actually due to the variable-joining (V-J) recombination process. Because both IGHV and IGK/LV segments were sequenced and analyzed in this study, we defined as unmutated CLL cases those patients whose leukemic clones exhibited <2% mutations in both segments, whereas patients whose leukemic clones exhibited ≥2% somatic mutations in the IGHV and/or IGK/LV were defined as mutated CLL (M-CLL) cases. The percentage of complementarity determining region (CDR) or framework region (FR) mutations was calculated on the basis of the number of base pairs in CDRs or FRs of the respective IGV genes. M-CLL clones were further subdivided into groups A or B as follows: if |% of IGHV mutation − % of IGK/LV mutation| < 3 → group A; if |% of IGHV mutation − % of IGK/LV mutation| ≥ 3 → group B.
The three units of percent difference were chosen for two operational reasons: (i) they represent a difference between the two chains of at least eight mutations, which we considered reasonable for “discordantly mutated” Igs, and (ii) they represent the ordinal number that divides κ samples (the most numerous group) into two similarly sized groups (63 samples in group A and 53 in group B).
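
The classification rules above can be summarized in a short sketch. This is only an illustration of the stated cutoffs, not the tool used in the study; the function name `classify_clone` and its parameter names are hypothetical.

```python
def classify_clone(ighv_pct, light_pct, mut_cutoff=2.0, diff_cutoff=3.0):
    """Classify one CLL clone from the percent mutation of its paired chains.

    ighv_pct  -- percent mutation of the rearranged IGHV
    light_pct -- percent mutation of the paired IGKV or IGLV
    The 2% and 3-unit cutoffs are the ones stated in the text; everything
    else (names, return labels) is an assumption made for illustration.
    """
    # Unmutated: both chains fall below the 2% cutoff.
    if ighv_pct < mut_cutoff and light_pct < mut_cutoff:
        return "unmutated"
    # Mutated (M-CLL): split by the absolute difference between the chains.
    if abs(ighv_pct - light_pct) < diff_cutoff:
        return "M-CLL group A"  # concordantly mutated
    return "M-CLL group B"      # discordantly mutated


if __name__ == "__main__":
    # Hypothetical clones: (IGHV %, IGK/LV %)
    for pair in [(1.0, 0.5), (6.1, 3.7), (6.5, 0.7)]:
        print(pair, classify_clone(*pair))
```

A clone with exactly three units of difference falls into group B, because the group A condition is a strict inequality.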

Detection, Classification and Scoring of Mutations

We assigned each rearranged IGHV and IGK/LV gene to the corresponding germline sequence using IMGT databases and the V-QUEST tool. Thereafter, both the germline and rearranged sequences were translated into amino acids and analyzed using in-house tools (17) that performed the following steps: the sequences were numbered and aligned according to the IMGT unique numbering (18); alignments of the rearranged genes with the corresponding germline sequence were examined to detect replacement (R) and silent (S) mutations; mutations were then classified as occurring in CDRs or FRs according to IMGT definitions (18). Finally, an estimate of the conservative nature of each R mutation was obtained using the BLOSUM62 substitution matrix (19,20). We assigned to each mutation the corresponding BLOSUM62 matrix element, obtaining values ranging from −4 (nonconservative mutations, for example, N→W) to 3 (conservative mutations, for example, F→Y).
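
The detection and scoring steps can be illustrated with a minimal sketch. It is not the in-house tool (17): it assumes two in-frame, ungapped nucleotide sequences of equal length, counts changes per codon rather than per nucleotide, reports plain codon positions rather than IMGT numbering, and uses only a small excerpt of BLOSUM62 entries. All names (`classify_mutations`, `BLOSUM62_EXCERPT`) are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# A few BLOSUM62 entries for illustration only; a real analysis needs the
# full 20 x 20 matrix.
BLOSUM62_EXCERPT = {
    frozenset("IV"): 3,
    frozenset("KR"): 2,
    frozenset("DE"): 2,
    frozenset("ST"): 1,
    frozenset("AS"): 1,
    frozenset("AG"): 0,
}


def classify_mutations(germline, rearranged):
    """Return (codon_index, germline_aa, rearranged_aa, kind, score) tuples.

    kind is "S" (silent) or "R" (replacement); score is the BLOSUM62 value
    for R changes when the pair is in the excerpt, otherwise None.
    """
    results = []
    usable = len(germline) - len(germline) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        g = germline[i:i + 3].upper()
        r = rearranged[i:i + 3].upper()
        if g == r:
            continue
        aa_g, aa_r = CODON_TABLE[g], CODON_TABLE[r]
        if aa_g == aa_r:
            results.append((i // 3 + 1, aa_g, aa_r, "S", None))
        else:
            score = BLOSUM62_EXCERPT.get(frozenset((aa_g, aa_r)))
            results.append((i // 3 + 1, aa_g, aa_r, "R", score))
    return results


if __name__ == "__main__":
    # Toy sequences: codon 1 changes AGC->AGT (silent), codon 2 ATT->GTT (I->V).
    print(classify_mutations("AGCATTGCT", "AGTGTTGCT"))
```

Assigning each change to a CDR or an FR would require the IMGT position boundaries, which are deliberately left out of this sketch.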

Statistical Analysis

Data are described as medians and ranges for continuous variables and as absolute and relative frequencies for categorical variables. The normality of data distribution was assessed by the Kolmogorov-Smirnov test. Spearman rank correlation was used to test the direction and strength of the relationship between the number of IGHV and IGK/LV mutations. The Mann-Whitney U test was used to compare sums of the scores obtained with the BLOSUM62 matrix analysis, computed separately for CDRs and FRs, between the IGHV and IGK/LV regions. Results were considered statistically significant when the P value was ≤0.05. All statistical tests were two-tailed. Statistical analyses were performed using the SPSS software (SPSS, Chicago, IL, USA) and the R software (http://www.R-project.org) (21).
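
For readers unfamiliar with the two rank-based statistics, the sketch below shows how they can be computed from first principles. It is an illustration only: the analyses in this study were run in SPSS and R, the function names are hypothetical, and the sketch returns the statistics without P values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mann_whitney_u(x, y):
    """Return (U_x, U_y), the Mann-Whitney U statistics of the two samples."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


if __name__ == "__main__":
    # Hypothetical R-mutation counts in paired heavy and light chains.
    heavy = [5, 8, 2, 7, 4]
    light = [3, 6, 1, 2, 4]
    print(spearman_rho(heavy, light))
    print(mann_whitney_u(heavy, light))
```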

All supplementary materials are available online at www.molmed.org.

Results

Sample Description

We analyzed the sequences of the rearranged IGHVs and IGK/LVs of 366 IgM+ CLL patients. The cohort comprised 47.3% (173/366) unmutated CLL cases and 52.7% (193/366) mutated CLL cases. The 193 M-CLL cases were used to investigate the reciprocal mutation patterns of paired IGV genes. In this group, 60.1% (116/193) and 39.9% (77/193) of clones were κ or λ isotype expressing, respectively. The IGHV and IGK/LV repertoires were analyzed separately and compared with previously reported cohorts (6,8,22). No major differences were evident (Supplementary Figures 1 and 2).

Analysis of Mutations

The M-CLL group comprised patients displaying ≥2% mutations in at least one of the two IGVs (IGHV and/or IGK/LV). Notably, whereas only 3.6% (7/193) of the samples had an IGHV with <2% mutations associated with an IGK/LV with ≥2% mutations, 26.4% (51/193) had an IGK/LV with <2% mutations associated with an IGHV with ≥2% mutations.

With respect to the overall mutation frequency of IGHVs and IGK/LVs, we found a mean percentage of mutation of 6.12% and 3.65%, respectively, for κ isotype samples (n = 116), with an IGHV-to-IGKV ratio of 1.7. The percentage of mutations in IGHV and IGLV of λ isotype-expressing samples (n = 77) was 5.54% and 3.47%, respectively, with an IGHV-to-IGLV ratio of 1.6.

The number of R mutations in kappa/lambda complementarity determining regions (K/LCDRs) was plotted as a function of the number of R mutations in heavy complementarity determining regions (HCDRs); in addition, the number of R mutations in FRs of the rearranged IGK/LV was plotted as a function of the number of R mutations in FRs of their partner IGHV. This step was done for all samples together (Figure 1A) and separately for the κ isotype (Figure 2A) and λ isotype samples (Figure 2C). Interestingly, whereas a significant concordance in R mutations was found in both CDRs and FRs for λ isotype samples, a correlation was observed for κ isotype samples only in FRs and not in CDRs. We asked whether using the absolute number of mutations, without considering CDR and FR lengths, would lead to biased results. To this end, CDR mutations were expressed as a percentage of the total number of CDR base pairs and FR mutations as a percentage of the total number of FR base pairs (see Materials and Methods). These results were almost identical to those obtained by plotting the absolute number of mutations (Supplementary Figure 3). To allow comparison with published data (vide infra), the absolute number of mutations was used hereinafter.
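
The length normalization described above amounts to a simple ratio, sketched here. The function name and the region lengths in the example are hypothetical and are not taken from the study.

```python
def percent_mutated(n_mutations, region_bp):
    """Express a mutation count as a percentage of the region length in bp."""
    if region_bp <= 0:
        raise ValueError("region_bp must be positive")
    return 100.0 * n_mutations / region_bp


if __name__ == "__main__":
    # Hypothetical lengths: a 60-bp CDR total and a 200-bp FR total.
    print(percent_mutated(3, 60))   # fewer mutations, but a higher density
    print(percent_mutated(6, 200))
```

With these made-up lengths, three CDR mutations give a higher mutation density than six FR mutations, which is the kind of length effect the percentage-based analysis was designed to rule out.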

Figure 1

Scatter plot of the number of R and S mutations in CDRs and FRs of paired IGHV and IGK/LV segments of all CLL samples. (A) R mutations. (B) S mutations. Each dot represents one or more CLL samples; color ranges from light gray (one sample) to dark gray (four or more samples). r_sp, Spearman rank correlation coefficient.

Figure 2

Scatter plot of the number of R and S mutations in CDRs and FRs of paired IGHVs and IGK/LVs. (A) R mutations of κ isotype samples. (B) S mutations of κ isotype samples. (C) R mutations of λ isotype samples. (D) S mutations of λ isotype samples. Each dot represents one or more CLL samples; color ranges from light gray (one sample) to dark gray (four or more samples). r_sp, Spearman rank correlation coefficient.

To determine whether the discordant pattern of R mutations was due to a differential action of the mutation mechanism on IGHVs and IGK/LVs or to selective antigenic pressure, we determined the distribution patterns of S mutations. For this purpose, the number of S mutations in the K/LCDRs was plotted as a function of the number of S mutations in the HCDRs; the same was done for the number of S mutations in FRs of the rearranged IGK/LV as a function of the number of S mutations in FRs of their rearranged partner IGHV for all samples (Figure 1B) and separately for κ isotype (Figure 2B) and λ isotype samples (Figure 2D). We found a statistically significant correlation between S mutations of IGHV and IGK/LV segments in both CDRs and FRs in pooled κ and λ samples (P = 0.026 and P = 0.0001, respectively). A significant correlation was also observed between S mutations of IGHV and IGKV in both CDRs and FRs of κ isotype samples (P = 0.03 and P = 0.001, respectively). In λ isotype samples, a significant correlation was detected for FRs (P = 0.005) but not for CDRs (P = 0.42), although the small number of silent mutations in the CDRs of λ isotype samples should be taken into account.

To date, only one study has addressed the pairing of H and L chain variable regions among B cells of the normal repertoire (14); this analysis was carried out only for IGHV/IGKV pairs of IgM-expressing B cells from two healthy donors. Those data indicate that IGHV R mutations correlated with IGKV R mutations in CDRs but not in FRs, the opposite of the mutation pattern that we have uncovered in CLL cell clones (see Figure 3 for a comparison of data).

Figure 3

Comparison of the pattern of R mutations in CDRs (upper row) and FRs (lower row) of paired IGHVs and IGKVs between κ-expressing normal B-cell Igs (left column) and κ isotype CLL Igs (right column). Left: scatter plots for the normal κ isotype B-cell repertoire, derived with permission from Brezinschek et al. (14). Right: scatter plots obtained from our data on κ isotype CLL samples (see Figure 2A). r_sp, Spearman rank correlation coefficient.

While evaluating the percentage of mutation of paired IGHV and IGK/LV, we observed a marked difference between the two chains in a proportion of patients. Therefore, we asked whether the absence of correlation in the number of mutations between IGHV and IGK/LV could be due to the presence of different populations within the patient cohort (for example, one with a similar number of mutations in both IGVs and the other with a notable difference in the number of mutations between the paired IGHV and IGK/LV segments). To exclude the possibility that two populations could bias the analysis, κ and λ isotype samples were divided into two groups using three units of percent difference (at least eight nucleotides; see Materials and Methods). Samples with a difference in the percentage of mutation of the two chains lower than three percentage units were considered “concordantly mutated” and assigned to group A (63 κ isotype samples and 52 λ isotype samples), whereas samples with a difference of three percentage units or more were classified as “discordantly mutated” and assigned to group B (53 κ isotype samples and 25 λ isotype samples) (Figure 4).

Figure 4

Splitting CLL samples into groups A and B on the basis of different mutation percentages between IGHVs and IGK/LVs. Scatter plot of the percentage of mutations in IGHV and IGK/LV pairings for samples displaying a <3% difference between IGHV and IGK/LV mutation percentages (group A) or ≥3% difference (group B), respectively (see Materials and Methods for details). (A) All CLL samples. (B) κ Isotype samples. (C) λ Isotype samples. Each dot represents one or more CLL samples; color ranges from light gray (one sample) to dark gray (four or more samples).

The degree of concordance between the number of R mutations in the CDRs and FRs of the two chains in the two groups was then computed for κ and λ separately (Figure 5A). For κ isotype samples, no significant correlation between the number of IGHV and IGKV R mutations was present in CDRs in either group A or group B (see Figure 5A, upper row). For λ isotype samples, however, the concordance in R mutations of both CDRs and FRs that was observed when the two groups were analyzed together was lost in group B but not in group A (Figure 5A, lower row).

Figure 5

Scatter plot of the number of R (A) and S (B) mutations in CDRs and FRs of IGHV and IGK/LV pairings in group A (<3% difference) and group B (≥3% difference). Each dot represents one or more CLL samples; color ranges from light gray (one sample) to dark gray (four or more samples). (A) Upper row: κ isotype samples; lower row: λ isotype samples. (B) Upper row: κ isotype samples; lower row: λ isotype samples. r_sp, Spearman rank correlation coefficient.

To evaluate the mechanism underlying the R distribution pattern in the two subgroups, the number of S mutations in the K/LCDRs was plotted as a function of the number of S mutations in the HCDRs; the same was done for the number of S mutations in FRs of the rearranged IGK/LV as a function of the number of S mutations in FRs of their rearranged partner IGHV for both groups A and B (Figure 5B). We found a statistically significant correlation between S mutations of IGHV and IGKV in both CDRs and FRs for group A (P = 0.01 and P < 0.0001, respectively), whereas there was no significant correlation in either CDRs or FRs for group B (P = 0.61 and P = 0.06, respectively). In λ isotype samples, a significant correlation between S mutations of IGLV and IGHV was observed only in FRs of group A.

The results of these analyses were further verified by comparing them with data obtained by plotting percentages instead of absolute numbers. No differences were observed between the two analyses (Supplementary Figure 4).

We next investigated whether the different patterns of mutation observed in IGKV and IGHV genes could be explained by a different pattern of R amino acid substitutions. To this end, the amount of diversity introduced by R mutations was determined for CDRs and FRs of both IGHV and IGK/LV. The BLOSUM62 matrix was used as a reference to measure the nature of each mutation, with a positive score indicating a conservative change and a negative score indicating a nonconservative change, and values ranging from +3 to −4 (see Materials and Methods). Values obtained for R mutations in the CDRs and FRs of each IGHV and IGK/LV were summed and analyzed separately for samples of groups A and B (Figure 6). The results did not indicate a significant difference in the nature of R changes between IGHV and IGKV of group A in either CDRs or FRs. However, in group B, changes in the kappa complementarity determining regions (KCDRs) were more conservative than those in the HCDRs. The opposite was true for FRs, where changes were more conservative in IGHVs. For λ isotype Igs, conservation in both CDRs and FRs was higher in IGHV than in IGLV in group A, whereas there was more conservation in the LCDRs than in the HCDRs in group B.

Figure 6

Comparison of the sums of the BLOSUM scores between IGHVs and IGK/LVs in groups A and B. Sums of the BLOSUM scores, obtained with the BLOSUM62 matrix analysis, were computed separately for CDRs and FRs. In each graph, the box represents the interquartile range (25–75th percentile) and the line within this box is the median value. Bottom and top bars of the whisker indicate the variation range. P values are shown.

A schematic representation summarizing all of the above data is shown in Figure 7.

Figure 7

Schematic summary of mutation frequency concordance and chemical features of R mutations in paired IGHV and IGK/LV of mutated CLL clones. The findings obtained from the analysis of the correlation of mutations and from the analysis of trait conservation for the paired IGHV and IGK/LV of the CLL cohort are shown. The left column summarizes the data shown in Figure 1 and indicates concordance (yes) or not (no) in the number of R mutations in paired IGHVs and IGK/LVs. The central column summarizes data presented in Figures 4A and 5 and displays the presence or absence of R mutation concordance as well as the comparison of diversification features generated by R mutations in IGHV and IGK/LV segments. The right column summarizes the data in Figure 4B and displays concordance of S mutations between paired IGHV and IGK/LV.

Discussion

Our analyses demonstrate that the pattern of somatic hypermutation in the rearranged IGV genes expressed on the surface of leukemic B cells from CLL patients is different for κ versus λ isotype Igs and that the mutation pattern of κ IGVs is also different from that of normal B cells. These findings point to a selective pressure acting on mutations in KCDRs of CLL Igs.

Specifically, we found that the mutation patterns of κ-expressing CLL cells and normal B lymphocytes (14) are quite different. Normal B cells display a correlation between the number of IGHV and IGKV R mutations in CDRs but not in FRs (14), whereas the opposite was observed in CLL, where there was no correlation between the number of R mutations in HCDRs and KCDRs but there was a relationship between the number of R mutations in H and K FRs. This result also held true when samples were divided into two groups (“A” [concordantly mutated] and “B” [discordantly mutated]) and analyzed separately. This scenario may reflect antigen-driven selection that protects KCDRs from R mutations, which would be more likely to change amino acid structure and antigen reactivity. Such a possibility was reinforced by analyses of the distribution patterns of S mutations in IGHV and IGKV segments. Indeed, the concordance between the relative numbers of S mutations of IGHV and IGKV in group A further suggests selective pressure acting on KCDRs. As expected, no correlation was observed in group B because of the arbitrary mutation cutoff applied.

Unlike κ isotype samples, the smIgs of λ isotype CLLs show a mutational pattern in accordance with the 3% cutoff criterion: in group A, the number of IGHV mutations correlated with IGLV mutations in both CDRs and FRs, whereas in group B, no correlation was found in either CDRs or FRs. The latter data support the hypothesis of Hershberg and Shlomchik (23) regarding strategies for κ and λ L chain variation. These authors observed differences in germline κ and λ L chain responses to mutation and attributed these differences to a need to balance diversity and stability in the immune response.

Because the low number of R mutations observed in KCDRs might be balanced by impactful nonconservative amino acid changes, we assessed the degree of conservation of R mutations in κ isotype samples divided into groups A and B. This analysis showed no significant differences between IGHV and IGKV in either CDRs or FRs of group A; in group B, changes were less conservative in HCDRs than in KCDRs, whereas the contrary was observed in FRs.

When the trait conservation analyses of κ and λ isotype samples divided into groups A and B were compared, a different behavior of the two isotypes was evident. Of note, the FR conservation trait of κ and λ isotypes in group B was opposite to that of group A, with much more diversification of κ than λ FRs, again in agreement with published data (23). Although we cannot compare our data with the corresponding Igs of λ-expressing B lymphocytes from normal subjects, because the latter have not been investigated in sufficient detail to provide correlative information, collectively our results suggest an important role for the KCDRs, as opposed to the LCDRs, in antigen binding. This result indicates different B-cell receptor (BCR)/antigen interaction modalities between B lymphocytes expressing different L chain isotypes.

It is of note that the combined κ and λ CLL samples belonging to group B represent ∼40% of the cohort. These samples are characterized by an evident difference in the number of mutations between IGHV and IGK/LV segments. In particular, in 75% of the cases, a mutated chain (≥2%) is associated with an unmutated chain (<2%). Theoretically, this difference could be explained by receptor editing: the discordantly mutated Igs would represent Igs that have undergone receptor editing after the somatic hypermutation process. For example, it is possible that B cells in which R mutations do occur in KCDRs as a consequence of antigen drive are either lost from the responding/expanding repertoire because of changes in (auto)antigen reactivity or undergo L chain editing. Although the occurrence of receptor editing in human antigen-experienced, IGV-mutated B cells in the periphery has not been definitively established, a few studies support this notion (24–26). Accordingly, it is tempting to speculate that receptor editing may account for discordantly mutated CLL Igs and that cells that undergo receptor editing are prime substrates for transformation to CLL (27).

Disclosure

The authors declare that they have no competing interests as defined by Molecular Medicine, or other interests that might be perceived to influence the results and discussion reported in this paper.