Background

Epstein-Barr virus (EBV) is a well-known infectious cofactor involved in the development of several malignancies, including endemic Burkitt lymphoma (eBL) and nasopharyngeal carcinoma (NPC) (reviewed in [1]). How EBV functions to drive malignancy, however, remains an open question. One possibility is that genetic variation in EBV leads to immune evasion of virally infected cells.

EBV encodes a number of genes that maintain cell proliferation, block apoptosis, and contribute to the malignant phenotype of cancer cells [2–5]. One of the main EBV encoded oncogenes is latent membrane protein-1 (LMP-1) [6]. LMP-1 is expressed during primary B cell infection, functioning as a constitutively active CD40 homolog and affecting many cellular proteins including TRADD, JAK3, PI3K, and RIPs [4, 7, 8]. Overexpression of LMP-1 in EBV-negative cell lines has shown that LMP-1 blocks apoptosis, increases cytokine production, cellular migration and transformation, and decreases cellular adhesion [8, 9]. The structure of LMP-1 includes six transmembrane regions starting at the N terminus, with a long cytoplasmic tail containing three C terminal activating regions (CTAR), responsible for activating signaling cascades (Figure 1) [8].

Figure 1

Diagram of LMP-1 structural and functional motifs. Cytoplasmic terminal activating regions are labeled CTAR1-3 and labeled with their corresponding amino acid numbers. The region that we sequenced is labeled, along with the positions of amino acid mutations in the K variant sequence, designated with *. The 10 amino acid deletion associated with NPC is labeled with X. The T cell epitope region of CTAR3 is labeled TCE, the JAK3 binding region is labeled JAK3, and the TRADD motif of CTAR2 is labeled TRADD.

Genetic variation of LMP-1 has been classified using different schemes [10–13]. These schemes were developed from sequences of different geographic areas and cellular origins. Sandvej and colleagues published the first of these classification schemes using a variety of healthy European sequences [12]. In that study they identified 4 variants of LMP-1 sequences in healthy Caucasians and labeled them A, B, C, and D [12]. The most frequent LMP-1 variant observed was variant A (41.2%), followed by variant C (26.5%), variant D (17.6%), variant B (11.8%), and uncharacterized (2.9%) [12]. Previous sequencing studies had been performed using tumor tissue rather than peripheral blood from healthy individuals [14], potentially selecting for certain viral sequences.

Mutations and deletions within the CTARs of LMP-1 have been associated with disease [15–17]. Specifically, a 10 amino acid deletion mutant of LMP-1 as compared to the prototypical B95.8 EBV strain has been associated with NPC cases in Asia, Europe, and North Africa [18–20]. In a retrospective study of EBV-positive lymphoproliferative disorders, the LMP-1 deletion mutant was linked to malignant phenotypes [21]. Deletions in LMP-1 have also been associated with other types of EBV-positive lymphomas [22–24]. One study of children in Turkey with Burkitt lymphoma reported a high frequency of the larger 69 base pair deletion variant of LMP-1, but this study did not compare incidence to healthy controls [25]. A study in Brazil reported that a similar high proportion of Burkitt lymphoma patients and controls harbored deletion variants of LMP-1 [26]. Other studies have examined the association of EBV variants with eBL and produced conflicting results [13, 27–30]. Focused studies on EBV variation in eBL patients relative to healthy controls are needed to clarify these divergent observations. To our knowledge, no study has examined the extent of genetic diversity of LMP-1 in an area endemic for BL or in eBL patients.

Genetic variation in LMP-1 has been shown to correlate with differences in T cell immunity [31–33]. Two ways that variant LMP-1 can decrease T cell immunity are through enhancement of regulatory T cells (Tregs) and immune evasion. The role of Tregs in NPC was examined by Pai et al., who found that an NPC-associated LMP-1 variant failed to stimulate T cells as effectively as wildtype LMP-1 in a mixed lymphocyte reaction [33]. The NPC-associated LMP-1 variant led to enhanced IL-10 production by antigen presenting cells, enhancing regulatory T cell function and reducing T cell responses to LMP-1 [33]. LMP-1 is also a target for EBV cytotoxic T lymphocytes (CTL) and has well described T cell epitopes [32, 34]. Duraiswami and colleagues showed that there are 6 LMP-1 peptide sequences that stimulate LMP-1 specific T cells to produce IFN-γ. Each of these regions was broken down into the minimal peptide sequences that were T cell epitopes. One of the T cell epitope regions within LMP-1 falls within CTAR3 [34], an area with known sequence variation [11, 12, 35]. A sequencing study of LMP-1 T cell epitopes from NPC patients showed no association with disease; however, it has not been shown whether LMP-1 variation within the T cell epitope region is associated with immune evasion in eBL [34]. While LMP-1 is not expressed in eBL, T cell control of EBV during primary infection of B cells may be impaired by different LMP-1 variants.

The current study sought to answer several outstanding questions. First, what is the diversity of LMP-1 sequence variation in an area endemic for eBL? Second, are certain LMP-1 genotypes associated with eBL compared to healthy controls? Finally, what does LMP-1 variation suggest about EBV pathogenesis? To answer these questions the C terminus of LMP-1 was sequenced from eBL patients and healthy controls from an eBL endemic area of western Kenya. A novel LMP-1 variant was observed in the Kenyan population, was highly prevalent in eBL patients, and carried mutations in the C terminal amino acids of both minimal T cell epitopes found in the portion of LMP-1 studied. These results may have implications for EBV-mediated immune evasion in the early events of Burkitt lymphomagenesis.

Results

Study populations

Endemic Burkitt lymphoma patients and healthy controls were selected based on their availability from our previously reported case control study [36]. In this study only 13% of eBL patients were parasitemic by blood smear at admission, although nearly all resided in a malaria holoendemic area [37]. Also 28% of parents reported giving their child antimalarial treatment in the two weeks prior to presentation (Moormann, unpublished observation). Therefore point prevalence malaria status for eBL patients at presentation to this tertiary care hospital is not an accurate indicator of recent malaria. We have previously reported that 68% of this group of healthy controls were malaria positive at sampling [36]. Additional controls (C17-C24) were included from a nearby area of western Kenya [38], and of these 57% were PCR positive for malaria. Although acute malaria increases EBV load and possibly detectability [39], we were able to amplify EBV DNA from all eBL patients and healthy controls sampled, suggesting a low rate of detection bias of EBV. After sequencing it was pathologically determined that two eBL patients had tumors other than eBL (BL16 and BL39), and their sequencing data were excluded from the analysis but can be found in Additional file 1: Table S1. The mean age of eBL patients was 90 months and for healthy controls was 54 months. For eBL patients 56.8% were male and for healthy controls, 40.9% were male. A summary of demographic data on the study populations is shown in Table 1.

Table 1 Demographic characteristics of study participants

Coinfection with multiple EBV variants

Coinfection with different EBV LMP-1 deletion variants was determined by difference in the product size among clones. One eBL patient and two healthy controls had two discernible variants in LMP-1 size as determined by the size of the cloned PCR product when analyzed by gel electrophoresis (Figure 2). Both of the variants for the three study participants were sequenced and pooled with the results of the remaining sequences for analysis, resulting in 39 eBL sequences and 24 healthy control sequences.

Figure 2

Gel electrophoresis image of plasmid digestion from three study participants. Lane 1 is the 100 base pair ladder, in which the 500 bp band has increased intensity. Lanes 2-6 are five clones from participant C2, lanes 7-11 are from participant C11-C12, and lanes 12-16 are from participant C13. The full-length product (~260 bp) is visible in all five clones from C2 and C13. The 30 base pair deletion mutant (~230 bp) is visible in two clones (lanes 7 and 9, C11) of participant C11-C12.

Diversity of LMP-1 sequence variants

The T cell epitope region of CTAR3 through the 30 base pair deletion region to the 3′ end of the LMP-1 gene that was sequenced is shown in Figure 1. Isolates were then categorized into the scheme defined by Sandvej and colleagues and also compared with the prototypic B95.8 strain of EBV [15, 35]. Because Sandvej et al. sequenced LMP-1 from many healthy Europeans [12], and compared the sequences to lymphoma patients [35], this classification scheme was chosen for the present study. In the present study of the C terminus of LMP-1, in contrast to Sandvej et al., variant A was not observed, while B, C, D, and B95.8 EBV LMP-1 variants were observed. Table 2 represents the full array of mutations observed in this study population, and the frequency of each variant in healthy control and eBL samples is shown in Figure 3. The only variant sequence represented exactly as described by Sandvej was the C variant, which was present in 15 (40.5%) eBL sequences and 7 (29.2%) control sequences (p=0.42, OR 1.65, 95% CI 0.55-4.97). However other variants could be characterized as similar to C type, differing only by single amino acid substitutions. These variants were denoted C’ and when combined with true C variant totaled 17 (45.9%) eBL samples and 10 (41.7%) healthy controls (p=0.80, OR 1.19, 95% CI 0.42-3.36). Thus no difference in the frequency of C variant was observed between eBL and healthy control sequences.

Figure 3

Frequency of all LMP-1 variants in healthy control and eBL patient samples. Bars represent the frequency of each LMP-1 type, including amino acid variants, e.g. K+K’. White bars represent eBL sequences and gray bars represent healthy controls.

Table 2 Location of all amino acid mutations present in this study

Variants of several other previously described LMP-1 isolates were observed, including B, D, and B95.8. There were no prototypical B strains, but 2 (5.4%) eBL sequences and 4 (16.7%) healthy control sequences differed by only one to two amino acids from the prototypical B strain. There was no significant difference in the proportion of B variant sequences between these two groups (p=0.20, OR 0.29, 95% CI 0.05-1.70). A single D variant strain, which differed from the prototypical D strain by two amino acids, was present in one healthy control sequence and no eBL sequences (p=0.39, OR 0.21, 95% CI 0.01-5.35). One prototypical B95.8 sequence occurred in an eBL patient. There were five B95.8 amino acid variants, 3 (12.5%) from healthy control sequences, and 2 (5.4%) from eBL sequences (p=0.37, OR 0.40, 95% CI 0.06-2.59). When these were analyzed together with the prototypical B95.8 sequence, no statistically significant difference in frequency of B95.8 variant was observed between eBL sequences and healthy controls (p=0.67, OR 0.62, 95% CI 0.11-3.35).

Presence of the 30 base pair deletion LMP-1 mutant detected by gel electrophoresis or by sequencing was compared and 100% concordance was observed between electrophoresis and sequencing studies in detecting the LMP-1 deletion (Figure 4, other data not shown). Next the frequency of the deletion mutant was compared between eBL cases and healthy controls. The 30 base pair deletion mutant was present in 17 (45.9%) eBL sequences and 10 (41.7%) healthy controls (p=0.80, OR 1.19, 95% CI 0.42-3.36).

Figure 4

Confirmation of agreement between gel electrophoresis and sequencing result. Patient BL26 and BL28 contained the full-length LMP-1 product, while BL27, BL29, and BL30 contained deletion variants by both electrophoresis and sequencing. Part a is a sample gel electrophoresis image from a PCR amplification of five eBL patient LMP-1 sequences. Lane 1 is a 100 base pair ladder, with 500 base pairs highlighted. Lane 2 is from patient BL26, lane 3 is from BL27, lane 4 is from BL28, lane 5 is from BL29, lane 6 is from BL30. Lane 7 is a no template PCR control. Part b represents the sequence traces of the corresponding eBL patient samples flanking the 30 base pair deletion.

No mutations were observed in the TRADD/RIP binding sequence of CTAR2, which spans amino acids 379-385 of LMP-1. Of the 63 sequence reads, 55 produced clean traces through the end of the LMP-1 coding sequence. The other 8 sequences were amplified with primers that did not include the last 8 amino acids of LMP-1, and this portion has been excluded from their analysis. In all 55 traces, the TRADD/RIP binding motif at the C terminal end of CTAR2 was 100% conserved.

Novel K variant of LMP-1

A previously uncharacterized LMP-1 variant was observed in both eBL patients and healthy controls. This variant always differed from the B95.8 sequence at 5 amino acids: G318K, Q322E, Q334R, L338S, and S366T; and was frequently found with H352R (52.4%). We have named the novel variant K for Kenya and for the novel lysine substitution at amino acid 318. The prototypical K variant was found in 9 (24.3%) eBL sequences and 2 (8.3%) healthy controls (p=0.18, OR 3.54, 95% CI 0.69-18.07). The atypical K variant containing H352R was found in 6 (16.2%) eBL sequences and 4 (16.7%) healthy controls (p=1.00, OR 0.97, 95% CI 0.24-3.87). When the prototypical K variant was combined with atypical K variant sequences for analysis there was no difference in frequency between eBL sequences and controls (p=0.27, OR 2.05, 95% CI 0.66-6.36).
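The definition of the K variant given above can be written as a simple rule, which may help when screening further sequences. The following is a minimal sketch, not the procedure used in this study: it assumes each sequence has already been reduced to a set of amino acid substitutions relative to B95.8, and the function name and return labels are hypothetical.

```python
# Substitutions relative to B95.8 that the text reports in every K variant sequence.
K_DEFINING = frozenset({"G318K", "Q322E", "Q334R", "L338S", "S366T"})
# Additional substitution that distinguishes the atypical K variant in the text.
K_ATYPICAL_MARKER = "H352R"


def classify_k_variant(substitutions):
    """Label a sequence from its substitutions relative to B95.8 (hypothetical helper).

    Returns "prototypical K", "atypical K", or "not K".
    Other substitutions in the input are ignored by this sketch.
    """
    subs = set(substitutions)
    if not K_DEFINING <= subs:
        return "not K"
    if K_ATYPICAL_MARKER in subs:
        return "atypical K"
    return "prototypical K"


if __name__ == "__main__":
    print(classify_k_variant(K_DEFINING))
    print(classify_k_variant(K_DEFINING | {"H352R"}))
    print(classify_k_variant({"Q322E"}))
```

A sequence carrying only some of the five defining substitutions is reported as "not K" here; how such sequences should be grouped is a judgment the sketch does not attempt to make.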

LMP-1 T cell epitope variants

Duraiswami and colleagues showed that only specific LMP-1 epitopes are able to elicit interferon-γ production from T cells [34]. One of these epitopes occurs in CTAR3, from amino acids 307 to 323. Within this region it was determined that there were two minimal sequences of 9 amino acids necessary for recognition by EBV-specific T cells. The minimal T cell epitope sequences within CTAR3 were AGNDGGPPQ and PSDSAGNDG. When the K sequence was mapped onto these epitopes, it was found that the K variant was mutated at the C terminal amino acid of both minimal T cell epitopes, creating sequences AGNDE GPPK and PSDSAGNDE. A diagram of the possible effects of these mutations on MHC-I loading is shown in Figure 5. The G318K mutation was highly linked to the Q322E mutation, such that all 22 sequences observed containing G318K also contained Q322E.

Figure 5

Diagram of minimal T cell epitopes in CTAR3 of wild-type EBV and mutations in K variant LMP-1. Highlighted are how known peptides fit into MHC-I and possible effects of mutations on MHC processing.

An amino acid mutation at Q322 in the C terminal of the T cell epitope was detected in 55 of 61 samples analyzed. While all K variant sequences contained two amino acid mutations in the T cell epitope region of CTAR3, all but two other sequences with mutations in this region harbored mutations only in Q322. Of the two sequences with multiple T cell epitope mutations, one was an alternate C variant sequence (BL36), with mutations in both terminal amino acids, to AGNDGGPSN. The other was a B variant sequence (C2), and contained the sequence AGNDN GPPE.

Discussion

The main goals of this study were to determine the genetic variation of the C terminus of LMP-1 in children residing in western Kenya, whether variation was linked to eBL versus healthy controls, and what LMP-1 variation suggests about EBV biology. To address the first goal of our study, the LMP-1 sequences obtained from Kenyan study participants were compared to previously reported sequences from healthy Caucasians [12]. The major LMP-1 sequences observed in the Kenyan population were the C variant and a previously unreported K variant sequence. We are unaware of any previous studies describing the characteristic G318K mutation of the K variant sequence. Other LMP-1 variants observed included the B, D, and B95.8. No A variant sequences were observed among this population from western Kenya, in contrast to the high prevalence observed in the European population [12]. This general pattern of EBV variants could suggest historical movement of EBV among populations [11]. For example, the A variant virus in the European population may have arisen independently of mutation in the African setting. Further studies using larger regions of the EBV genome and sequences from diverse geographical regions are necessary to validate these observations across the global population.

The second aim of this study was to determine if certain LMP-1 genotypes were associated with eBL as compared to healthy controls. None of the previously characterized LMP-1 variants observed were associated with eBL, including B, C, D, and B95.8. The novel K variant LMP-1 was found in 40.5% of eBL sequences and 25.0% of healthy controls (p=0.27). Larger sample sizes are needed to confirm whether K variant LMP-1 is associated with eBL in Kenya. Still undetermined is whether the K variant sequence is associated with eBL in other areas endemic for Burkitt lymphoma, which would support an immune evasive phenotype of K variant LMP-1, or if it arose independently in the Kenyan population. The selection of EBV genetic variants in cancer agrees with previous work suggesting that EBV-associated Hodgkin’s disease selects for certain LMP-1 variants, which differ from the distribution of variants in the general population [35]. Similarly in eBL, previous work on EBNA-1 has suggested that certain EBNA-1 variants are more oncogenic than others [27]. Although some research has suggested the selection of specific EBNA-1 genetic variants in lymphomas, other work has suggested that specific EBNA-1 variants are associated only with geographic areas and not with eBL [28].

T cell control of EBV is critical for the development of protective immunity [40]. It was recently confirmed in a mouse model that T cell control of LMP-1 is necessary for inhibiting lymphomagenesis [41]. It has also been determined that only specific LMP-1 epitopes generate interferon-γ responses from T cells [34]. The possible link to T cell immune evasion in K variant LMP-1 derives from the mutated anchor residues in the C terminal binding groove of both of the two known minimal T cell recognition sequences of CTAR3 in the K variant. In addition to their specific location within the anchor position, these mutations resulted in changes in the polarity of the amino acid. The first mutation was from the small and uncharged glycine at position 318 to larger and positively charged lysine. The second mutation at amino acid 322 was from uncharged glutamine to negatively charged glutamic acid. Mutations in the C terminal binding groove affect the ability of peptides to be loaded onto appropriate MHC class I molecules [42, 43], so these mutations may play an important role in MHC loading, decreasing the ability of LMP-1 derived peptides to be presented at the cell surface. Our study did not evaluate the MHC specificity of these variants, but the Kenyan population has very high MHC heterogeneity [44], and it is possible that people with certain MHC variants are unable to present these novel LMP-1 peptides. Functional studies are necessary to characterize the MHC specificity of the novel LMP-1 variants identified in this study.

Given the immune evasion hypothesis it is interesting that we did not observe a difference in the frequency of K type LMP-1 between eBL patients and controls. There are multiple possible explanations for this. One possibility is that the sample size of the current study was too small to detect a difference between these populations. Sampling a larger population was unfortunately not possible for this study. Another possibility is that LMP-1 variants of eBL patients and controls differ in critical T cell epitopes outside of the region sequenced here. It is known that LMP-1 T cell epitopes exist outside of CTAR2 and that amino acid variation leads to functional consequences [34], so this remains a possibility that should be examined by future studies. Another possibility is that K type LMP-1 in healthy individuals clusters spatially with high-risk eBL clusters [37, 45]. Spatial data were not recorded in the current study, possibly altering the frequency of K type LMP-1 that would be observed in high versus low risk healthy controls. We believe that future studies including the entire coding region of LMP-1 with larger sample sizes will help resolve this apparent discrepancy.

A major limitation of this study was that LMP-1 was sequenced from DNA extracted from peripheral blood lymphocytes rather than eBL tumor tissue. We were unable to obtain biopsy tissue for these studies. However previous work showed that EBV isolated from eBL biopsy samples contained the same EBNA-1 sequence as EBV obtained from peripheral blood of the same individual, indicating that tumor and peripheral blood EBV isolates were genetically identical [28].

Conclusions

The C terminus of LMP-1 was sequenced from peripheral blood of eBL patients and healthy controls in western Kenya. The Kenyan population demonstrated an altered distribution of LMP-1 variants compared to previous studies in Europe. A previously undocumented LMP-1 variant was also observed, named K for Kenya and for its novel lysine (K) substitution. The K variant LMP-1 is characterized by amino acid mutations in the C terminal anchor residues of both minimal T cell epitopes of LMP-1 CTAR3, which may lead to functional differences in MHC loading. The K variant was found at a higher frequency in eBL sequences than in healthy control sequences (40.5% versus 25.0%), but this difference was not statistically significant (p=0.27). Since this variant has not been described in eBL samples previously, larger patient populations will need to be studied to determine whether the K variant is linked to eBL development. Future studies are also needed to confirm the functional role of K variant mutations on MHC loading and T cell immune evasion.

Methods

Samples

Endemic BL patients were enrolled when presenting to the New Nyanza Provincial General Hospital in Kisumu, Kenya, and healthy controls were enrolled from a nearby malaria holoendemic area as previously described [46]. Additional controls (C17-C24) were included from a subset of samples of a separate study of healthy children living in a nearby area of Kisumu, Kenya [38]. After obtaining informed consent, approximately five milliliters of peripheral blood was drawn from children with eBL and healthy controls. Whole blood was frozen at -80°C until use. From these frozen samples, 38 eBL patients and 22 healthy controls were randomly selected for sequencing. After beginning the study it was pathologically determined that 2 eBL patients (BL16 and BL39) had non-eBL tumors and their sequencing data were excluded from analysis.

Ethical approval

Ethical approval was obtained from the Institutional Review Boards at The State University of New York Upstate Medical University (Rochford), The University of Massachusetts Medical School (Moormann), and the Ethical Review Committee at the Kenya Medical Research Institute, Nairobi, Kenya. Parents of minor study participants provided individual, written informed consent in accordance with the Declaration of Helsinki.

DNA extraction

DNA was extracted from whole blood using the QIAamp DNA Mini Kit (Qiagen, Germantown, MD, USA) according to the manufacturer’s instructions.

PCR amplification

The LMP-1 segment spanning the 3′ T cell epitope and JAK3 binding site of CTAR3 as well as CTAR2 was amplified using the following primers, numbered according to sequence NC_007605.1: 5′-GCGACTCTGCTGGAAATGAT-3′ (167912-31) and 5′-GACATGGTAATGCCTAGAAG-3′ (167672-91). For control samples C17 through C24, primers were 5′-CCGTGGGGGTCGTCATCATC-3′ (167730-49) and 5′-CTCCCGCACCCTCAACAAGC-3′ (168262-43). Primers were acquired from Integrated DNA Technologies (Coralville, IA, USA). Each PCR reaction mixture contained 2.5 μl 10× PCR Buffer, 2.5 μl dNTP mixture, 1.25 μl RedTaq Polymerase (Sigma, Saint Louis, MO, USA), 2.5 μl LMP-1 forward and reverse primers at 3 μM, and 11.75 μl molecular grade water (Mediatech, Herndon, VA, USA). The amplification procedure consisted of a 95°C denaturation step for 10 minutes, followed by 50 cycles of 95°C for 30 seconds, 58°C for 30 seconds, and 72°C for 45 seconds. Reactions were carried out in an iCycler thermocycler (BioRad, Hercules, CA, USA). Positive control DNA was amplified from the EBV positive cell line B95.8. PCR product size was confirmed by gel electrophoresis using a 2% AquaPor agarose (National Diagnostics, Atlanta, GA, USA) gel containing 5% ethidium bromide (Sigma, Saint Louis, MO, USA) at 10 mg/ml.
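The expected product sizes can be checked from the primer coordinates. The sketch below is illustrative only and rests on two assumptions: that the abbreviated coordinates (for example 167912-31) denote inclusive positions 167912 to 167931 on NC_007605.1, and that the amplicon runs between the outermost primer positions. The function names are hypothetical.

```python
def expand_range(abbrev):
    """Expand an abbreviated coordinate pair such as '167912-31' to (167912, 167931).

    The digits after the hyphen are assumed to replace the same number of
    trailing digits of the first coordinate.
    """
    first, suffix = abbrev.split("-")
    second = first[: len(first) - len(suffix)] + suffix
    return int(first), int(second)


def amplicon_length(primer_a, primer_b):
    """Length in base pairs between the outermost primer positions (inclusive)."""
    coords = expand_range(primer_a) + expand_range(primer_b)
    return max(coords) - min(coords) + 1


if __name__ == "__main__":
    full_length = amplicon_length("167912-31", "167672-91")
    print("main primer pair:", full_length, "bp")
    print("with 30 bp deletion:", full_length - 30, "bp")
    print("C17-C24 primer pair:", amplicon_length("167730-49", "168262-43"), "bp")
```

Under these assumptions the main primer pair gives a 260 bp product in the B95.8 reference, which agrees with the approximately 260 bp full-length band and the approximately 230 bp deletion band described in the legend of Figure 2.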

Cloning

After confirming the appropriate product length, PCR products were cloned using the TOPO TA pCR 2.1 cloning kit with TOP10 chemically competent Escherichia coli according to the manufacturer’s instructions (Invitrogen, Carlsbad, CA, USA). Five clones per sample were selected and run on an agarose gel to visualize the presence of the LMP-1 product and the size of the amplicon.

Plasmid DNA was purified from E. coli using a Qiagen Plasmid Purification Mini Kit (Germantown, MD, USA) according to the manufacturer’s instructions and eluted in HPLC grade water. To confirm the presence of the LMP-1 insert, plasmid DNA was digested with EcoRI (New England Biolabs, Ipswich, MA, USA) according to the manufacturer’s instructions. A total of 5 clones per sample were digested. Digestion products were run on a 2% agarose gel as described above to confirm the presence of LMP-1 insert DNA.

Sequence analysis

Plasmids containing cloned LMP-1 PCR products were sent to Genewiz (South Plainfield, NJ, USA) for sequencing using M13R universal primers. Sequences were aligned using Unipro UGENE software (Novosibirsk, Russia).

Statistical analysis

The frequency of LMP-1 variants was compared between eBL patients and healthy controls using Fisher’s exact test, with odds ratios (OR) and 95% confidence intervals (95% CI), in GraphPad Prism version 5.0b (La Jolla, CA, USA).
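For readers who wish to reproduce comparisons of this kind without GraphPad Prism, the calculation can be sketched with the Python standard library. This is an approximation under stated assumptions, not the software used in this study: the two-sided p value sums all tables no more probable than the observed one, and the confidence interval uses the Woolf (log odds ratio) method, which may differ slightly from Prism's output. The function names are hypothetical, and tables with a zero cell are not handled.

```python
from math import comb, exp, log, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf method; all cells must be > 0)."""
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


if __name__ == "__main__":
    # Combined K variant counts: 15 of 37 eBL sequences, 6 of 24 control sequences.
    table = (15, 22, 6, 18)
    print("p =", fisher_exact_two_sided(*table))
    print("OR, lower, upper =", odds_ratio_woolf(*table))
```

The example table uses the combined K variant counts from the Results, taking 37 eBL and 24 control sequences as the denominators implied by the reported percentages. With these counts the odds ratio evaluates to about 2.05, in line with the reported value.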