Background

Epstein-Barr Virus (EBV) is a herpesvirus that infects 90–95% of humans, causing lifelong infection [1, 2]. EBV infection during childhood is generally asymptomatic; however, acquisition of EBV during adolescence or early adulthood often causes infectious mononucleosis (IM) [3], which can cause substantial morbidity during important educational periods in adolescents and young adults [4, 5]. EBV is associated with 1% of global cancers, particularly Hodgkin’s lymphoma, Burkitt’s lymphoma, nasopharyngeal cancer and gastric cancer [6].

EBV infection is currently neither treatable nor preventable by vaccination; however, vaccine candidates are in development. In phase II trials, a first-generation vaccine administered to healthy seronegative volunteers aged 16–25 years demonstrated protection against IM but not EBV infection [7]. Second-generation vaccines elicited higher antibody responses in animal models [8], and first-in-human trials are likely to begin soon. Mathematical modelling is essential to determine the effectiveness and cost-effectiveness of different vaccination strategies for reducing rates of EBV infection, IM, and EBV-associated cancers, taking into account factors such as vaccine efficacy, duration of protection and differing outcomes according to age at infection.

A greater understanding of EBV epidemiology, including the dynamics of EBV infection in different sub-populations, is necessary for the development of such models. EBV seroprevalence increases with age; 90–95% of people globally are infected by age 25, whilst 5–10% remain seronegative throughout life [9]. The best public health strategy for the deployment of an infection-preventing vaccine may vary between settings; infection appears to occur at younger ages in resource-limited countries and thus children will need to be vaccinated early [10,11,12]. However, if the duration of vaccine-induced protection is not lengthy, vaccinated individuals may become susceptible to natural infection at an age where the consequences of infection are more severe, for example leading to IM or cancer [13].

Additionally, sub-optimal vaccine coverage even of a vaccine with a long duration of protection will lead to a higher age at infection amongst those who remain unvaccinated. In such situations it may be better to delay vaccination until the pre-teenage years, targeting individuals who remain EBV seronegative. Alternatively, a vaccine protecting against IM and EBV-associated diseases (such as certain cancers) could be administered to older children as they approach adolescence, which may be effective even with a shorter duration of protection. After the licensing of vaccine candidates, strategic discussions will need to take place nationally and be informed by accurate national data on the epidemiology of EBV infection.

In the United Kingdom, EBV seroprevalence increases rapidly in very young children, reaching 21% and 51% by the age of two years in children of white and Pakistani ethnicity, respectively [14]. Another study showed that EBV seroprevalence then remained relatively constant, at around 55%, between the ages of five and 11 years [15]. EBV seroprevalence was estimated at 75% in university students at 19 years and 92% by the age of 22 years [16]. We recently published summary data on the seroprevalence of EBV in adolescents in England [13]; however, to date no study has investigated factors associated with seropositivity that could inform a targeted vaccination strategy.

Our aim was to investigate the sociodemographic and lifestyle factors, particularly age, associated with EBV serostatus in children and young adults in England, and to discuss the implications of our findings for future EBV vaccination policy.

Methods

Study population

The Health Survey for England (HSE) is an annual, cross-sectional, representative survey of households in England. Its methods are described in detail elsewhere [17]. For this study, and to parameterise a model of EBV transmission [13], we randomly selected individuals who participated in the 2002 HSE; 2002 was the most recent year in which survey participants gave consent for future studies to test their blood samples for blood-borne viruses. Our aim was to include 25 participants of each sex in each single-year age group from 11 to 24 years, in order to fill a gap in the literature and capture the years at which infection is most likely to have clinical consequences. The participant IDs were selected randomly by the HSE; however, it was not possible at the time of sampling to determine whether the samples had already been used. As a result, more than 25 IDs were selected for each age-sex group to ensure there were sufficient samples for our analysis, and therefore there are not exactly 25 samples in each group (Additional file 1: Table S1).
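The sampling target described above implies a simple grid of age-sex cells. The short Python sketch below is illustrative only and is not part of the study's procedures; the variable names are our own, and it simply enumerates the cells and the target sample size implied by 25 participants per cell.

```python
# Illustrative arithmetic only: enumerate the age-sex cells described in the text.
ages = range(11, 25)            # single-year age groups 11 to 24 inclusive
sexes = ("male", "female")
target_per_cell = 25            # target number of participants per age-sex cell

cells = [(age, sex) for age in ages for sex in sexes]
target_total = len(cells) * target_per_cell

print(len(cells), "cells; target sample size", target_total)
```

Because more than 25 IDs were drawn per cell, the achieved sample need not equal this target exactly.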

Measuring seroprevalence of Epstein-Barr virus and cytomegalovirus infection

Stored blood serum samples collected between January 2002 and March 2003 were obtained from the HSE. Samples were posted to the laboratory within two days, where they were centrifuged; the remaining serum was frozen and stored at −40 °C until analysis, which was completed in September 2017 [18].

EBV viral capsid antigen (VCA)-specific IgG and CMV-specific IgG were detected in serum samples using commercial ELISA kits obtained from EUROIMMUN, Germany (EI2791–9601-G, EI2570-9601G). Assays were performed according to the manufacturer’s instructions and serum antibody concentrations were calculated using a standard curve. Data on the performance of the assays are detailed in Additional file 1: Table S2. Results were expressed in relative units (RU/mL): samples with <16 RU/mL were considered negative, ≥16 to <22 RU/mL borderline and ≥22 RU/mL positive. Borderline results from the EBV VCA IgG ELISA were re-analysed with an EBV immunoblot assay (EUROIMMUN, Germany, DY2790G), which revealed that all borderline serum samples (n = 5) had reactivity to alternative EBV antigens; they were therefore considered seropositive.
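The cut-offs above can be expressed as a small classification rule. The following Python sketch is a simplified illustration, not the laboratory's software: the function name and the `borderline_as` parameter are hypothetical. In the study, borderline samples were treated as seropositive only after immunoblot confirmation; here that decision is reduced to a single switch, which also allows the sensitivity analysis (borderline treated as seronegative) to be mimicked.

```python
def classify_ebv_vca_igg(ru_per_ml: float, borderline_as: str = "positive") -> str:
    """Classify an EBV VCA IgG ELISA result given in RU/mL.

    Cut-offs follow the text: <16 negative, 16 to <22 borderline, >=22 positive.
    `borderline_as` is a hypothetical parameter: "positive" mirrors the main
    analysis, "negative" mirrors the sensitivity analysis.
    """
    if ru_per_ml < 0:
        raise ValueError("antibody concentration cannot be negative")
    if borderline_as not in ("positive", "negative"):
        raise ValueError("borderline_as must be 'positive' or 'negative'")
    if ru_per_ml < 16:
        return "negative"
    if ru_per_ml < 22:
        # Borderline band: resolved according to the chosen analysis.
        return borderline_as
    return "positive"


print(classify_ebv_vca_igg(18.0))
print(classify_ebv_vca_igg(18.0, borderline_as="negative"))
```

Keeping the borderline decision as an explicit argument makes the main and sensitivity analyses differ by one setting rather than by separate code paths.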

Statistical analysis

Data were analysed in Stata version 15.0. We weighted our sample, using the svy commands in Stata, to be representative of the English population in 2002 with respect to age and sex, utilising data from the Office for National Statistics [19]. All stated percentages are weighted. Descriptive analyses of the study population were undertaken. ArcMap 10.3.1 was used to create a map of EBV seroprevalence by English Government Office Region [20].
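To illustrate the idea behind weighting a sample to population age-sex totals, the Python sketch below computes post-stratification weights and a weighted prevalence. It is a conceptual sketch under stated assumptions and not a reproduction of the Stata svy analysis: the function names are our own, the toy data are entirely hypothetical, and variance estimation (which the survey commands handle) is not covered.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_counts):
    """One weight per sampled person: population share / sample share of their stratum."""
    n = len(sample_strata)
    sample_counts = Counter(sample_strata)
    pop_total = sum(population_counts[s] for s in sample_counts)
    return [
        (population_counts[s] / pop_total) / (sample_counts[s] / n)
        for s in sample_strata
    ]


def weighted_prevalence(positive_flags, weights):
    """Weighted proportion positive: sum of weights of positives / sum of all weights."""
    positive_weight = sum(w for flag, w in zip(positive_flags, weights) if flag)
    return positive_weight / sum(weights)


# Hypothetical toy data (NOT study data): two strata, two people sampled from each.
strata = [("11-14", "F")] * 2 + [("22-24", "F")] * 2
positive = [True, False, True, True]
population = {("11-14", "F"): 300, ("22-24", "F"): 100}

weights = poststratification_weights(strata, population)
prevalence = weighted_prevalence(positive, weights)
print(weights, prevalence)
```

In this toy example the younger stratum is under-represented in the sample relative to the population, so its members receive weights above 1 and the weighted prevalence is pulled towards their lower seropositivity.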

To investigate factors associated with being seropositive for EBV, we undertook logistic regression modelling. A causal inference framework was used to determine a priori factors to be included in multivariable models, from the available data collected in the HSE. We built two multivariable regression models.

A ‘whole-population’ model, which included our entire study population, examined the following factors: age, sex, ethnicity (categorised as ‘white’ or ‘other’ due to small numbers of non-white participants), body mass index (BMI; categorised as ‘underweight’ [BMI <20], ‘healthy weight’ [20 to <25], ‘overweight’ [25 to <30] or ‘obese’ [≥30]), region of England and CMV serostatus.

A second ‘adults-only’ model was restricted to individuals aged ≥16 years, and additionally included factors for which data were available only for adults: smoking status (never smoked, current smoker, smoked in past) and occupational category from the National Statistics Socio-economic classification (NS-SEC) [21]. The NS-SEC categorises occupations into higher managerial and professional roles (involving strategy/supervision), intermediate occupations (typically clerical, sales, service or technical positions which do not involve general planning or supervision), routine and manual occupations (involving basic labour), never worked or long-term unemployed, and other. We excluded individuals missing data on one or more variables.

Planned sensitivity analyses investigated the impact of excluding CMV serostatus as a predictor of EBV serostatus, and the impact of classifying the originally indeterminate serological results as seronegative rather than seropositive.

Ethical approval

This study was approved by the University College London Research Ethics Committee (5683/002). The HSE obtained informed written consent for blood samples to be collected and stored for future analyses [17].

Results

Our study sample included 732 individuals aged 11–24 years, of whom 547 (74.6%) were EBV-seropositive. The characteristics of seropositive individuals are shown in Table 1.

Table 1 The number and weighted percentage of individuals seropositive for EBV in England in 2002

EBV serostatus was associated with CMV serostatus; 72.6% of CMV-seronegative individuals were EBV seropositive compared to 80.9% of CMV-seropositive individuals (χ2 test P = 0.04, Table 1). Considerable variation in EBV seroprevalence was observed by region of England (Fig. 1, Table 1). EBV seropositivity increased with age, from 39.6% at 11–14 years to 93.0% at 22–24 years (Fig. 2).

Fig. 1 Weighted Epstein-Barr virus seroprevalence by English Government Office Region in 2002. Contains National Statistics data © Crown copyright and database right [2011]. Contains public sector information licensed under the Open Government Licence v3.0

Fig. 2 Weighted seroprevalence of Epstein-Barr virus by age in England in 2002. CI: confidence interval; EBV: Epstein-Barr virus

Factors associated with EBV seropositivity were largely consistent between the univariable and multivariable models (Table 2). Increasing age was associated with increased EBV seroprevalence (adjusted odds ratio [aOR] 9.16 [95% confidence interval (CI) 4.38–19.14] for people aged 22–24 years compared to those aged 11–14 years), as was non-white ethnicity (aOR 2.33 [1.13–4.78]). CMV seropositivity was associated with EBV seropositivity in the ‘adults-only’ multivariable model (aOR 2.16 [1.05–4.43]) but not in the ‘whole-population’ model (aOR 1.25 [0.79–1.98]).
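For readers less familiar with logistic regression output, the Python sketch below shows how an odds ratio and its 95% CI relate to the underlying log-odds coefficient and standard error. It assumes a Wald-type interval that is symmetric on the log scale with a critical value of 1.96; this is an assumption made for illustration, and the survey-weighted Stata estimation may have used a slightly different critical value. The function names are our own. The inputs are the reported aOR and CI for ages 22–24 versus 11–14 years.

```python
import math


def log_odds_and_se(odds_ratio, ci_lower, ci_upper, z=1.96):
    """Recover the log-odds coefficient and its standard error from an OR and its CI.

    Assumes the CI was built as exp(beta +/- z * se); z = 1.96 is an assumption.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and standard error back to an OR and CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Reported aOR for ages 22-24 vs 11-14 years: 9.16 (95% CI 4.38-19.14).
beta, se = log_odds_and_se(9.16, 4.38, 19.14)
odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
print(round(beta, 3), round(se, 3), round(lower, 2), round(upper, 2))
```

The round trip reproduces the reported interval to within rounding, which is what one would expect if the interval is symmetric around the coefficient on the log-odds scale.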

Table 2 Univariable and multivariable logistic regression models of factors associated with Epstein-Barr Virus seropositivity in England in 2002

Among adults, EBV seropositivity was higher among those who currently smoked than among those who had never smoked (aOR 4.29 [2.13–8.65]). There was no evidence of associations between sex, BMI or occupational category and EBV serostatus.

In sensitivity analyses, we firstly excluded CMV serostatus as a predictor of EBV serostatus, and secondly we classed indeterminate serology results (n = 5) as seronegative rather than seropositive. Both sensitivity analyses showed results consistent with our main analyses (Additional file 1: Table S3, Table S4).

Discussion

The importance of EBV as a cancer-causing pathogen has generated international interest in developing an anti-infection vaccine [22]. The cost-effectiveness of different strategies to deploy such vaccines will vary from setting to setting and is dependent on the epidemiology of the infection. For example, EBV’s association with IM means that vaccines that do not produce lifelong immunity may be better targeted towards subgroups which are likely to acquire infection in adolescence. In this observational study of factors associated with EBV seroprevalence among young people in England in 2002, we explored the distribution of seroprevalence by age and the sources of additional variability. We found a substantial increase in EBV seroprevalence with age among our sample population, associations with ethnicity and smoking, and a potential association with CMV seroprevalence.

A series of studies has demonstrated that EBV is generally acquired before adulthood, and that the age at acquisition varies between settings [12]. Our findings regarding smoking fit with the prevailing narrative that there is an association between EBV and socioeconomic status, rather than smoking being an independent risk factor [12]. Unfortunately, we did not have a good measure of socioeconomic status in our analysis; the NS-SEC does not account for familial socioeconomic status during childhood, which is probably more relevant to EBV seroprevalence than individual occupational status in young adults, and we were unable to measure socioeconomic status in children at all.

We found that EBV seroprevalence varied substantially between regions of England in univariable analyses and in the whole-population model, but not in the adults-only model, suggesting confounding between region and socioeconomic status. There was also a strong association between EBV seropositivity and ethnicities other than white, in both univariable and multivariable models. This may be the result of different mixing patterns (as people of ethnic minorities are more likely to live in larger households), different feeding practices, or residual confounding by socioeconomic status. CMV is another herpesvirus which infects a high proportion of the population from a young age [23], and has also been associated with EBV in other settings [24, 25].

In England, EBV infects 55% of the population by the age of 12 [15], i.e. prior to adolescence, when the risk of IM increases. Cost-effective deployment of a cheap, infection-preventing vaccine with a lifelong duration of protection could thus involve targeting the early years. However, future vaccines may produce a shorter duration of immunity, potentially delaying infection and increasing the incidence of IM (and IM-associated cancers). This could be compounded by sub-optimal vaccine coverage increasing the average age at infection [26] and consequently the rate of IM, much as sub-optimal coverage of the MMR vaccine led to an increase in congenital rubella syndrome in Greece [27, 28].

In such a scenario, targeted vaccine deployment to the social groups who acquire infection later (when the likelihood of IM is higher) might be considered, possibly with repeated dosing if required. Such targeting could be informed by the risk factors detected within this analysis, and data such as those presented here should be considered in conjunction with the characteristics of the vaccine available when determining what a vaccine policy should look like. If a vaccine were cheap and effective, then universal coverage would be appropriate. If the duration of protection were short, it may be prudent to give repeat doses of the vaccine to people who acquire the infection at the youngest age, which is linked to ethnicity and likely to socioeconomic status. The use of an expensive vaccine could be stratified on the basis of who is most likely to suffer EBV-related disease after infection, which we have studied separately [29].

The limitations of our work include the age of the data and the use of a cross-sectional study design, preventing determination of the temporality of the correlation between EBV and CMV infection. In our analysis, EBV seroprevalence was higher than CMV seroprevalence in all age groups, and both increased with age. We found that CMV was associated with EBV in univariable analyses, and in the adults-only model, but not in the whole-population multivariable model. As both EBV and CMV are associated with increasing age, particularly during adolescence, we would not expect an association between CMV and EBV to persist in the whole-population multivariable model once age is adjusted for. It is possible that as the association between age and EBV seroprevalence was less strong in the adults-only multivariable model (as EBV seroprevalence starts to saturate as people reach adulthood), there was enough of a residual effect that the association between EBV and CMV could be detected. Unfortunately, our sample size was not large enough to investigate the interactions between EBV, CMV and age in more detail. The association may result from shared genetic, immunological and/or sociodemographic risk factors, or one infection could increase susceptibility to the other. Longitudinal studies with serial testing are necessary to explore this association, and additional risk factors, in more detail.

We elected to measure IgG antibodies to the EBV VCA protein and whole CMV virus, as these antibodies are present in all infected individuals and persist for life. Although we did not test for IgM antibodies, and cannot exclude the possibility that some seronegative individuals may have been recently infected, we note that VCA-specific IgG and IgM antibodies usually appear contemporaneously [30] and therefore we would expect the number of such individuals in our study to be low.

Conclusions

Knowledge of the distribution of EBV infection among young population groups in England is critical for determining future vaccination policies, including the cost-effectiveness of general versus selective approaches. Data such as those presented here should be used together with detailed information on vaccine characteristics, the implications of remaining EBV-uninfected for life, the ramifications of delayed infection, and the financial costs of IM and EBV-associated cancers to inform such policies.