Simulation-guided design of serological surveys of the cumulative incidence of influenza infection
Influenza infection does not always cause clinical illnesses, so serological surveillance has been used to determine the true burden of influenza outbreaks. This study investigates the accuracy of measuring cumulative incidence of influenza infection using different serological survey designs.
We used a simple transmission model to simulate a typical influenza epidemic and obtained the seroprevalence over time. We also constructed four illustrative scenarios for baseline levels of antibodies prior to infection and levels of boosting following infection in the simulated studies. Although illustrative, three of the four scenarios were based on the most detailed empirical data available. We used standard analytical methods to calculate estimated seroprevalence and associated confidence intervals for each of the four scenarios for both cross-sectional and longitudinal study designs. We tested the sensitivity of our results to changes in the sample size and in our ability to detect small changes in antibody levels.
There were substantial differences in background antibody titres and levels of boosting among the three illustrative scenarios that were based on empirical data. These differences propagated through to different and substantial patterns of bias for all scenarios other than those with very low background titre and high levels of boosting. The two survey designs resulted in similar seroprevalence estimates in general under these scenarios, but when background immunity was high, simulated cross-sectional studies had higher biases. Sensitivity analyses indicated that an ability to accurately detect low levels of antibody boosting within paired sera would substantially improve the performance of serological surveys, even under difficult conditions.
Levels of boosting and background immunity significantly affect the accuracy of seroprevalence estimates, and depending on these levels of immune response, different survey designs should be used to estimate seroprevalence. These results suggest that under current measurement criteria, cumulative incidence measured by serological surveys might have been substantially underestimated in certain scenarios by failing to include all infections, including mild and asymptomatic infections. Dilution protocols more highly resolved than serial 2-fold dilution should be considered for serological surveys.
Keywords: Infection attack rate; Cumulative incidence; Seroprevalence; Influenza; Serological survey; Cross-sectional study design; Longitudinal study design; Mathematical modelling
Influenza infection does not always cause clinical illnesses and the rate of non-clinical infection most likely varies from strain to strain. Therefore, with the majority of surveillance systems based on clinical episodes, uncertainty regarding the number of unobserved infections dominated other epidemiological uncertainties during the 2009 pandemic.
Serological studies provide one option with which to resolve these uncertainties and a number were conducted (or at least initiated) during the 2009 H1N1 pandemic [3, 4, 5]. The rationale for conducting serological studies is straightforward: they are a complementary surveillance activity to traditional symptom-based and laboratory-based surveillance. Serological studies provide the alternative approach of monitoring immunity levels in a population and do not need to test people during the short period of time when they are symptomatic. In cross-sectional serological studies, a single blood sample is drawn from members of the population and tested for the presence of high levels of antibodies to the virus of interest. In longitudinal serological studies, two or more samples are taken from members of the population and are tested for significant rises in antibodies.
In contrast to serology-based community studies, the measurement of influenza incidence in the community using PCR-based assays is not feasible because of the short time during which infected individuals shed virus. The intensity of sampling and testing would be prohibitively expensive. Serological studies also have advantages over symptom-based surveillance. Not all influenza-like-illnesses (ILI) are caused by influenza infection, nor does every influenza infection result in an influenza-like-illness. Despite the fact that not every infection results in increased antibody titres, it might be expected that assay-measured increases in effective antibody concentration are considerably less biased than symptom-based definitions such as ILI.
Despite these advantages over some alternative survey designs, serological surveys do suffer from a number of limitations. In particular, intuitively, two features of the population and strain affect the likely accuracy of a serological survey: levels of pre-existing antibodies to the strain of interest (most likely caused by cross-reactivity to prior circulating strains) and an inability of the virus to generate high levels of antibody boosting. Here, we investigate the impact of these two ecological features on the ability of two different study designs to estimate accurately the cumulative incidence of infection. Cumulative incidence based on a seroepidemiological study is a measurement of seroprevalence, which quantifies the proportion of individuals whose serological specimens are seropositive for an infective pathogen. Unlike case prevalence, which quantifies disease occurrences during a study period, seroprevalence quantifies antibody prevalence based on a serological test that reflects cumulative experience, both past and recent infection, with an infectious agent.
We used a parsimonious disease-dynamic model to make a deterministic prediction of seroprevalence at a given number of days after the introduction of a novel respiratory pathogen. From this, we simulated estimates of seroprevalence from the appropriate statistical model for either a cross-sectional or a longitudinal study, based on four illustrative scenarios for the baseline level of antibodies in the population and the degree of boosting after infection.
The transmission process was modeled as a deterministic density-dependent susceptible-infected-recovered (SIR) model. An SIR model involves only three health states, namely susceptible, infected, and recovered, and the number of infected individuals (either at a specific instant or as the cumulative occurrence) was the primary outcome. A density-dependent model assumes that the number of contacts depends on the susceptible population size while the attack rate stays constant. Demographic changes to the population, such as births and deaths, were not considered important over the short period of interest. The model can be easily parameterised in terms of the reproductive number R and the average time between generations of infection T_g.
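As a minimal sketch of such a model (not the authors' code), the following integrates an SIR system in population proportions with a simple Euler scheme. It assumes the transmission rate is R/T_g and the recovery rate is 1/T_g, which treats the mean generation time as equal to the mean infectious period. The default values of R, T_g, the initial infected proportion and the step size are illustrative assumptions, not values taken from the study.

```python
def simulate_sir(R=1.5, T_g=2.6, i0=1e-5, days=300, dt=0.1):
    """Return the cumulative incidence (proportion ever infected) at the end of each day.

    R, T_g, i0 and dt are hypothetical illustrative values.
    Assumes beta = R / T_g and gamma = 1 / T_g.
    """
    beta = R / T_g       # transmission rate per day
    gamma = 1.0 / T_g    # recovery rate per day
    s, i, c = 1.0 - i0, i0, i0   # susceptible, infected, cumulative incidence
    steps_per_day = int(round(1.0 / dt))
    cumulative = [c]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - recoveries
            c += new_infections
        cumulative.append(c)
    return cumulative


if __name__ == "__main__":
    ci = simulate_sir()
    # Proportion infected between baseline (day 0) and a follow-up at day 100
    print(ci[100] - ci[0])
```

The difference between two entries of the returned list gives the proportion of the population infected between the corresponding days, which is the quantity a simulated survey needs.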
Illustrative scenarios for baseline antibodies and boosting
We defined four different scenarios for baseline titres and boosting. The first scenario, Scenario A, was entirely theoretical and was used to demonstrate that both longitudinal and cross-sectional designs give unbiased estimates of seroprevalence under best-case assumptions. In Scenario A, we assumed that no individuals had a detectable antibody titre and that all individuals underwent antibody boosts to 1:40 following infection.
Scenarios B and C were based on data from the Hong Kong Longitudinal Cohort Study. Scenario B used titre values against A/California/4/2009(H1N1) and Scenario C used titre values against A/Perth/16/2009(H3N2). No PCR data were available for either of these scenarios, so prior to any sensitivity analyses, we assumed that titre increases of 2-fold or greater were actual infections but that we could only use rises of 4-fold or greater to reliably infer infection. All study protocols of the Hong Kong Longitudinal Cohort Study were approved by The Institutional Review Board of The University of Hong Kong/Hospital Authority Hong Kong West Cluster.
Scenario D was based on a group of PCR-confirmed infections from 2009 in England and Wales for which serological assay results were also available. Unfortunately, it was not possible to match baseline and follow-up titres at the level of the individual for this cohort. In this study, 1403 serum samples were collected in 2008 and 1954 serum samples were collected in 2009. We extracted the pre- and post-infection titre levels from the published paper, including only those individuals who had shown a rise in titre. Then, since individuals' boosting levels were unavailable, we estimated each boosting level as the minimum possible given the different combinations of pre- and post-infection titre values.
Model of immune responses
Haemagglutination-inhibiting (HI) antibody titres were represented by titre thresholds in the form (<1:10, 1:10, 1:20, 1:40, ..., 1:1280) for datasets from Riley and colleagues, whereas those from Miller and colleagues were represented as (<1:8, 1:8, 1:16, 1:32, ..., 1:1024). For mathematical convenience, we transformed both the baseline and post-infection antibodies onto non-negative integers, y, such that y = log2(z/A), where the actual titre threshold was 1:z and we assumed that <1:8 was equal to 1:4 (making A = 4) and <1:10 was equal to 1:5 (A = 5). On this scale, a four-fold or greater difference in titres corresponded to an increase of 2 or more in y.
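The transformation above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `titre_to_log` is hypothetical, and the undetectable class is encoded by its assumed value (1:5 or 1:4) as described in the text.

```python
import math


def titre_to_log(z, A):
    """Map a titre of 1:z onto the integer scale y = log2(z / A).

    A is the value assumed for the undetectable class (5 for the
    <1:10 series, 4 for the <1:8 series), so that class maps to y = 0.
    """
    return int(round(math.log2(z / A)))


# <1:10 series, with <1:10 encoded as 1:5
series_10 = [5, 10, 20, 40, 80, 160, 320, 640, 1280]
print([titre_to_log(z, 5) for z in series_10])  # [0, 1, 2, 3, 4, 5, 6, 7, 8]

# <1:8 series, with <1:8 encoded as 1:4
series_8 = [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print([titre_to_log(z, 4) for z in series_8])   # [0, 1, 2, 3, 4, 5, 6, 7, 8]

# A four-fold rise, e.g. 1:10 to 1:40, is a difference of 2 on the log scale
print(titre_to_log(40, 5) - titre_to_log(10, 5))  # 2
```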
The deterministic model provides a prediction of the cumulative incidence over time. We assumed that our serological study was of size n. We then used a simple statistical simulation model to generate the results of serological surveys. Each simulated survey was assumed to have drawn baseline blood samples at time t = 0 and follow-up samples at time t = t_f. We drew from the assumed baseline distribution of log titres for all n individuals in the simulated study. Although we considered many different values for t_f, we never assumed more than a single follow-up sample was taken from any individual. The difference in cumulative incidences between times t = 0 and t_f gave us the proportion of the population who were infected. Therefore, we randomly assigned each individual as infected or not based on that proportion. The follow-up log titre for those not infected was assumed to be the same as their baseline log titre. For those infected, we drew a random log boosting value from the assumed log boosting distribution, added that to their baseline log titre and recorded the resulting value as their follow-up log titre.
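The sampling steps described above might be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: the function name, the dictionary representation of the baseline and boosting distributions, and the example survey size and infected proportion are all hypothetical. The Scenario A distributions follow the text (no detectable baseline titre, boosting to 1:40, i.e. y = 3 on the A = 5 log scale).

```python
import random


def simulate_survey(n, p_infected, baseline_dist, boost_dist, seed=None):
    """Simulate paired log titres for n individuals.

    baseline_dist and boost_dist map log-titre values to probabilities
    (or weights). Returns a list of (baseline_y, followup_y, infected).
    """
    rng = random.Random(seed)
    base_values, base_weights = zip(*baseline_dist.items())
    boost_values, boost_weights = zip(*boost_dist.items())
    records = []
    for _ in range(n):
        # Draw the baseline log titre
        y0 = rng.choices(base_values, weights=base_weights)[0]
        # Assign infection status from the cumulative-incidence difference
        infected = rng.random() < p_infected
        # Uninfected individuals keep their baseline titre
        boost = rng.choices(boost_values, weights=boost_weights)[0] if infected else 0
        records.append((y0, y0 + boost, infected))
    return records


if __name__ == "__main__":
    # Scenario A: nobody has detectable antibodies; all infections boost to 1:40
    scenario_a = simulate_survey(
        n=1000, p_infected=0.3,
        baseline_dist={0: 1.0}, boost_dist={3: 1.0}, seed=1,
    )
    print(sum(infected for _, _, infected in scenario_a) / len(scenario_a))
```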
Based on the definitions of seroprevalence for the different survey designs, the seroprevalence and estimated errors can be quantified as a function of pre- and post-infection antibody levels. Specifically, in the analysis of the serial cross-sectional study design, we defined seroprevalence as the proportion of individuals in the population who were seropositive after excluding the proportion of individuals in the population who were seropositive at baseline. Conversely, in the analysis of paired sera samples in longitudinal studies, seroprevalence was defined as the proportion of individuals in the population that had an increase of 2 units or greater in log titre.
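One reading of these two definitions can be written as a pair of small estimators. The sketch below is hedged in several ways: the seropositivity threshold of y = 3 (a titre of 1:40 on the A = 5 scale) is an assumption, the cross-sectional estimate is taken to be the follow-up seropositive proportion minus the baseline seropositive proportion, and the Wald interval is only a stand-in because the text does not specify which standard interval was used. All function names are hypothetical.

```python
import math


def cross_sectional_estimate(baseline_y, followup_y, threshold=3):
    """Follow-up seropositive proportion minus baseline seropositive proportion.

    threshold=3 (titre 1:40 with A = 5) is an assumed seropositivity cut-off.
    """
    p0 = sum(y >= threshold for y in baseline_y) / len(baseline_y)
    p1 = sum(y >= threshold for y in followup_y) / len(followup_y)
    return p1 - p0


def longitudinal_estimate(pairs, min_rise=2):
    """Proportion of paired sera with a rise of at least min_rise log units."""
    return sum((y1 - y0) >= min_rise for y0, y1 in pairs) / len(pairs)


def wald_interval(p, n, z=1.96):
    """Approximate 95% Wald interval for a single proportion (assumed method)."""
    half_width = z * math.sqrt(max(p * (1.0 - p), 0.0) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # Best case: no baseline titre, every infection boosts by 3 log units
    pairs = [(0, 3)] * 30 + [(0, 0)] * 70
    p_long = longitudinal_estimate(pairs)
    print(p_long, wald_interval(p_long, len(pairs)))
    print(cross_sectional_estimate([y0 for y0, _ in pairs], [y1 for _, y1 in pairs]))

    # Difficult case: high baseline titre, boost of only 1 log unit
    hard = [(3, 4)] * 30 + [(3, 3)] * 70
    print(longitudinal_estimate(hard))
    print(cross_sectional_estimate([y0 for y0, _ in hard], [y1 for _, y1 in hard]))
```

In the difficult case both estimators return zero even though 30% of the simulated individuals were infected, which illustrates the kind of downward bias the study examines.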
For influenza A, we used available data to define three illustrative scenarios for baseline titre values and boosting of titres following infection. We simulated cross-sectional and longitudinal serological studies based on these scenarios in order to assess the accuracy with which it was possible to measure the cumulative incidence of infection. We found that plausibly high levels of background immunity (perhaps due to cross-reactivity) and plausibly low levels of boosting following infection could introduce substantial biases to the estimates of cumulative incidence. Although biases were higher for cross-sectional study design than for the longitudinal study design in general, when levels of background immunity were low, there was little difference between the performances of the two designs. Sensitivity analyses indicated that an ability to detect infections from low levels of antibody boosting would substantially improve the performance of serological surveys, even under difficult conditions, i.e., when background titres are high and/or boosting after infection is low. Such conditions were observed in the elderly in Hong Kong during the 2009 pandemic.
The baseline-boosting scenarios used here were based on two different empirical study designs. Scenarios B and C drew on data from a longitudinal community-wide seroprevalence study. Therefore, the boosting assumptions for these scenarios reflect accurately the distribution of changes in antibody state during the epidemic. However, no independent data exist with which to define infection in these data so it was not possible to tease out assay variation from low levels of infection. Conversely, the data used to define boosting for Scenario D are based on PCR-confirmed infections and therefore accurately describe boosting for the cohort of individuals from which these data were obtained. However, because these samples arose from clinical cases, they likely reflect patterns of antibody boosting among a more severe subset of infections. One way to overcome these symmetric challenges in the different data sets would be to conduct a community-wide cohort study with intense virological sampling in addition to baseline and follow-up serology.
The main purpose of the deterministic model was to produce a realistic proportion of the population who are infected between two time points. Variations in transmission parameters, such as the reproductive number R and generation time T_g, would be important for future survey design, but are not important for the interpretation of the simulated serological surveys.
We chose to simulate the dynamics within only a single homogeneously mixing population. Usually, serological studies of influenza in the community will be motivated by a whole set of questions of which estimating the cumulative incidence will only be one. A number of these other questions will likely relate to specific population subgroups. For example, one age group may be over-represented relative to another among the clinical cases, and it might be hoped that the serological study will help to resolve whether the difference is being driven by differential rates of infection or by differential rates of becoming symptomatic. Also, having a higher proportion of school-aged children in the population would drive the epidemic to peak earlier than that shown in a homogeneous population. However, the conclusion regarding the effects of background and boosting titre levels on the accuracy of seroprevalence measurement would have been the same if the model were age-stratified. The framework we describe here would still be useful in the design of field studies motivated by important subgroup questions as long as individual subgroups are treated as separate populations, so that sample sizes and timings of follow-up would be chosen with specific types of individual in mind. Also, it would be straightforward to extend the transmission model to include multiple age groups and thus describe expected differences in the timing of peaks of infection between subgroups [4, 14].
The determination of cumulative incidence of infection within a population using serological studies is not without weaknesses. For instance, as noted, mild and asymptomatic infections may yield antibody titres below the minimum detection limit or below the level required for seroconversion. In fact, a proportion of the pandemic H1N1 infections in 2009 were defined as seronegative following virologically confirmed infection [1, 3, 15]. Antibody titres may also be reduced in patients who were undergoing antiviral treatment. Nevertheless, this model can be extended to explore the effects of these issues if the antibody boosting between baseline and follow-up under these conditions is known.
Interpretation of the results of haemagglutination inhibition (HI) tests and microneutralization (MN) assays may further be complicated by vaccination. Often (although not always) self-reported vaccination status is available from longitudinal studies and not from cross-sectional studies. The simulation methods we have described here could be extended to incorporate this extra uncertainty where the vaccination status of individuals is not known but where the average rate across the population is known. Although this usually applies to cross-sectional studies, there is no reason the potential bias could not be assessed for both study designs.
Our simulations (Figure 4) showed that being able to reliably detect small increases in antibody titre could substantially improve the accuracy of longitudinal seroepidemiological studies when conditions are difficult: when background titres are high and boosting after infection is sometimes low. Although recently developed novel statistical methods are able to tease apart low levels of infection from measurement error, these rely on the use of a PCR-confirmed subset of data. As already mentioned, it may be difficult to obtain these data for a representative sample of the population. Therefore, we suggest that the potential reduction in bias from a more sensitive assay illustrated in these simulation results justifies trials of dilution protocols with higher resolution than 2-fold, especially for longitudinal studies.
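The effect of the detection threshold can be illustrated with a short calculation. The sketch below computes the fraction of true infections that a paired-sera study would count for a given minimum detectable rise in log titre. The boosting distribution is entirely hypothetical and is not taken from any of the scenarios, and the calculation ignores assay measurement error, which in practice is what makes small rises hard to interpret.

```python
def detected_fraction(boost_counts, min_rise):
    """Fraction of infections whose log-titre boost is at least min_rise.

    boost_counts maps a boost (in log2 units) to a number of infections.
    """
    total = sum(boost_counts.values())
    detected = sum(count for boost, count in boost_counts.items() if boost >= min_rise)
    return detected / total


# Hypothetical boosting distribution: many infections boost by only one log unit
hypothetical_boosts = {1: 4, 2: 3, 3: 2, 4: 1}

# Conventional criterion: four-fold rise (2 log units)
print(detected_fraction(hypothetical_boosts, min_rise=2))  # 0.6

# More sensitive criterion: two-fold rise (1 log unit)
print(detected_fraction(hypothetical_boosts, min_rise=1))  # 1.0
```

Under these made-up numbers the four-fold criterion misses 40% of infections, so the longitudinal estimate of cumulative incidence would be biased downward by the same factor.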
High levels of background titres and low levels of boosting affect estimates of cumulative incidence of influenza infection derived from seroepidemiological studies. When background immunity is high, simulated cross-sectional studies are particularly prone to higher biases. Otherwise, the two survey designs produce similar seroprevalence estimates in general. Assays capable of reliably detecting low levels of boosting after infection would greatly improve the performance of longitudinal studies when conditions are difficult.
The authors thank the reviewers for their insightful comments on the original manuscript. KMW acknowledges scholarship support from Swire Company and The University of Hong Kong. SR acknowledges: the Medical Research Council (UK, Project MR/J008761/1); the Wellcome Trust (UK, Project 093488/Z/10/Z); the Fogarty International Centre (USA, R01 TW008246-01); Fogarty International Centre with the Science & Technology Directorate, Department of Homeland Security (USA, RAPIDD program); and the National Institute for Health Research (UK, for Health Protection Research Unit funding).
- 1. Cowling BJ, Chan KH, Fang VJ, Lau LLH, So HC, Fung ROP, Ma ESK, Kwong ASK, Chan C-W, Tsui WWS, Ngai H-Y, Chu DWS, Lee PWY, Chiu M-C, Leung GM, Peiris JSM: Comparative epidemiology of pandemic and seasonal influenza A in households. N Engl J Med. 2010, 362 (23): 2175-2184. 10.1056/NEJMoa0911530.
- 4. Wu JT, Ho A, Ma ESK, Lee C-K, Chu DKW, Ho P-L, Hung IFN, Ho LM, Lin CK, Tsang T, Lo S-V, Lau Y-L, Leung GM, Cowling BJ, Peiris JSM: Estimating infection attack rates and severity in real time during an influenza pandemic: analysis of serial cross-sectional serologic surveillance data. PLoS Med. 2011, 8 (10): 1001103. 10.1371/journal.pmed.1001103.
- 5. van Kerkhove MD, Hirve S, Koukounari A, Mounts W: The H1N1pdm serology working group: Estimating age-specific cumulative incidence for the 2009 influenza pandemic: a meta-analysis of A(H1N1)pdm09 serological studies from 19 countries. Influenza Other Respir Viruses. 2013, 7 (5): 872-886. 10.1111/irv.12074.
- 8. Riley S, Kwok KO, Wu KM, Ning DY, Cowling BJ, Wu JT, Ho L-M, Tsang T, Lo S-V, Chu DKW, Ma ESK, Peiris JSM: Epidemiological characteristics of 2009 (H1N1) pandemic influenza based on paired sera from a longitudinal community cohort study. PLoS Med. 2011, 8 (6): 1000442. 10.1371/journal.pmed.1000442.
- 9. Fraser C, Donnelly CA, Cauchemez S, Hanage WP, Kerkhove MDV, Hollingsworth TD, Griffin J, Baggaley RF, Jenkins HE, Lyons EJ, Jombart T, Hinsley WR, Grassly NC, Balloux F, Ghani AC, Ferguson NM, Rambaut A, Pybus OG, Lopez-Gatell H, Alpuche-Aranda CM, Chapela IB, Zavala EP, Guevara DME, Checchi F, Garcia E, Hugonnet S, Roth C: The WHO Rapid Pandemic Assessment Collaboration: Pandemic potential of a strain of influenza A (H1N1): early findings. Science. 2009, 324 (5934): 1557-1561. 10.1126/science.1176062.
- 15. Wu JT, Ma ESK, Lee C-K, Chu DKW, Ho P-L, Shen AL, Ho A, Hung IFN, Riley S, Ho L-M, Lin CK, Tsang T, Lo S-V, Lau Y-L, Leung GM, Cowling BJ, Peiris JSM: The infection attack rate and severity of 2009 pandemic H1N1 influenza in Hong Kong. Clin Infect Dis. 2010, 51: 1184-1191. 10.1086/656740.
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2334/14/505/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.