DATA MODEL INTEGRATION: THE GLOBAL EPIDEMIC AND MOBILITY FRAMEWORK
GIVEN A SET OF INITIAL CONDITIONS for the local outbreak of a new potentially pandemic pathogen, the timeline of the arrival of the epidemic in each country is mainly determined by the human mobility network that couples different regions of the world. By looking at individual countries or a given continent in isolation, any estimate of the epidemic timeline would be based on assumptions about imported cases from the rest of the world. Human mobility patterns are hence key to consistently simulating the mobility of infectious individuals on the global scale, and thus providing ab initio estimates of the epidemic timeline in each country or urban area without assumptions on case importation.
An individual-based stochastic mathematical model of the infection dynamics
Real-world data on the mobility of this population
Real-world data on the global population
The real-world population and mobility data discussed in the previous chapter are used to determine when and where people will interact and potentially transmit the infection. In order to do that, GLEAM divides the world into a grid of small square cells. Satellite and census sources are used to calculate the population density in each of these cells, which are then clustered into subpopulations centered on their nearest transportation hub.
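The clustering step can be illustrated with a minimal sketch, assuming that each grid cell is simply assigned to the hub with the smallest great-circle distance. The function names, the toy coordinates, and the absence of any distance cutoff are assumptions made for illustration; they are not taken from GLEAM itself.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cluster_cells(cells, hubs):
    """Sum cell populations into the subpopulation of the nearest hub.

    cells: list of (lat, lon, population); hubs: dict name -> (lat, lon).
    """
    totals = {name: 0 for name in hubs}
    for lat, lon, pop in cells:
        nearest = min(hubs, key=lambda h: haversine_km(lat, lon, *hubs[h]))
        totals[nearest] += pop
    return totals

# Hypothetical toy data: two hubs and four populated grid cells.
hubs = {"A": (0.0, 0.0), "B": (0.0, 10.0)}
cells = [(0.0, 1.0, 100), (0.5, 2.0, 50), (0.0, 9.0, 70), (-0.5, 8.0, 30)]
subpopulations = cluster_cells(cells, hubs)
```

In this toy example the two cells near longitude 0 are attached to hub A and the two near longitude 10 to hub B, so the grid collapses into two subpopulations.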
GLEAM simulates human mobility and disease spreading in a sequence of time steps (representing full days). Within each population cluster, the spread of the infection among individuals is governed by the characteristics of the disease and the containment and mitigation responses specified in the epidemic model. The disease is transmitted between population clusters when people commute to work or school or travel longer distances on national and international flights. On high-performance computers, GLEAM executes millions of stochastic simulations, making it possible to generate for each population the statistical ensemble of possible epidemic evolutions and analytics for quantities such as newly generated cases, seeding events, time of arrival of the infection, and others.
BUILDING A SYNTHETIC WORLD
MOVING PEOPLE AROUND
The spatiotemporal patterns of the disease spreading are associated with the mobility flows that couple different subpopulations. These flows constitute the mobility data layer, which is represented as a network of connections among subpopulations. The network specifies the number of individuals traveling from each subpopulation to the others. It is made up of different kinds of mobility processes, from short-range commuting to intercontinental flights, with time scales and traffic volumes that span several orders of magnitude. The airline system layer integrates air travel mobility and contains the list of worldwide origin-destination flows between airport pairs on a daily schedule. Individuals travel on airplanes according to an explicit dynamic that considers the probability for each individual in the population to travel on a specific route.
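As a rough illustration of the per-individual travel dynamic, the sketch below assumes that each resident takes route j on a given day with probability equal to the average daily flow on that route divided by the subpopulation size. The function name, the flows, and the population are hypothetical, and the total daily flow is assumed to be smaller than the population.

```python
import random

def sample_travelers(rng, population, daily_flows):
    """Draw, individual by individual, who leaves on which route in one day.

    daily_flows maps destination -> average passengers per day. Each resident
    takes route j with probability daily_flows[j] / population, otherwise stays.
    """
    probs = [(dest, flow / population) for dest, flow in daily_flows.items()]
    travelers = {dest: 0 for dest in daily_flows}
    for _ in range(population):
        u = rng.random()
        cumulative = 0.0
        for dest, p in probs:
            cumulative += p
            if u < cumulative:
                travelers[dest] += 1
                break  # an individual takes at most one route per day
    return travelers

rng = random.Random(42)
travelers = sample_travelers(rng, 10_000, {"B": 120, "C": 30})
```

The counts fluctuate from day to day around the average flows, which is what makes the seeding of new subpopulations a stochastic event.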
The short-range commuting mobility of individuals is simulated with an effective approach that defines mixing subpopulations and identifies the number of individuals N_ij(t) of subpopulation i effectively present in subpopulation j at time t (see INFOBOX 3.1). This methodology assumes that subpopulation i has an effective number of individuals N_ij << N_ii in contact with the individuals of the neighboring subpopulation j, where N_ii is the number of residents of i present in i itself. This holds in a quasi-stationary state, which is reached whenever the time scale of the epidemic spreading is much longer than the time scale of commuting.
3.1 INFOBOX: MIXING SUBPOPULATIONS
In the case of commuting flows, we assume that individuals in subpopulation i visit any one of the connected subpopulations with a per capita diffusion rate σ_i. Because we aim to model commuting processes in which individuals have a memory of their location of origin, displaced individuals return to their original subpopulation with rate τ_i.
This implies that in the regime σ_i << τ_i, N_ij(t) represents a small perturbation to the overall subpopulation of size N_j. These expressions are used to obtain the effective force of infection taking into account the interactions generated by the commuting flows.
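The quasi-stationary populations can be sketched by balancing departures against returns. Assuming residents of i leave toward j at rate σ_ij (with σ_i the sum over j) and return at rate τ, the stationary condition σ_ij N_ii = τ N_ij together with N_i = N_ii + Σ_j N_ij gives N_ii = N_i / (1 + σ_i/τ) and N_ij = N_ii σ_ij/τ. This derivation and the numerical rates below are assumptions consistent with the description above, not a transcription of the expressions used in GLEAM.

```python
def effective_populations(n_i, sigma_ij, tau):
    """Quasi-stationary split of subpopulation i between home and neighbours.

    n_i: resident population of i; sigma_ij: dict neighbour -> commuting rate;
    tau: return rate. Returns (N_ii, {j: N_ij}).
    """
    sigma_i = sum(sigma_ij.values())
    n_ii = n_i / (1.0 + sigma_i / tau)
    n_ij = {j: n_ii * s / tau for j, s in sigma_ij.items()}
    return n_ii, n_ij

# Hypothetical rates per day: commuting is slow compared with returning.
n_ii, n_ij = effective_populations(100_000, {"j": 0.02, "k": 0.01}, tau=3.0)
```

With σ_i much smaller than τ, almost all residents are counted at home and only a small fraction is effectively present in each neighbor, which is the small-perturbation regime described in the infobox.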
THE DISEASE DYNAMIC
3.2 INFOBOX: MODELING THE DISEASE TRANSMISSION
The I → R transition is the simplest one to model. For many types of diseases, the amount of time spent in the infectious class is distributed around a well-defined mean value. Realistically, the probability that a person moves from the I class to the R class depends on how much time they have spent in the I class. The distribution of the “infectious period” and the transition probability can generally be estimated from clinical data; as a simplifying modeling assumption, however, the probability of transition is taken to be constant. In this way it is possible to define a transition probability per unit time step μ, called the recovery probability. Since we are dealing with a probability per unit time, the average time an individual spends in the infectious compartment, the mean infectious period, is equal to μ⁻¹.
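The link between a constant recovery probability and the mean infectious period can be checked numerically. The sketch below assumes that an individual has one chance to recover in every time step, including the first, so the time spent in I follows a geometric distribution with mean 1/μ. The value of μ and the sample size are arbitrary choices for illustration.

```python
import random

def infectious_period(rng, mu):
    """Number of time steps spent in I when recovery occurs with probability mu per step."""
    steps = 1
    while rng.random() >= mu:
        steps += 1
    return steps

rng = random.Random(1)
mu = 0.25  # hypothetical recovery probability per day
samples = [infectious_period(rng, mu) for _ in range(20_000)]
mean_period = sum(samples) / len(samples)  # should lie close to 1 / mu = 4 days
```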
The simplest approaches consider chain binomial processes in which the discrete individuals are indistinguishable and characterized only by their compartmental state. These models can be made more realistic by including age structure or other features of the individuals. In this case the transmission of the disease is described by parameters that depend on those features. An example is provided by models implementing specific contact matrices that characterize the number of contacts among individuals in different age brackets.
At the finest level, synthetic population constructions are even more refined and classify locations into households, schools, offices, and so on. The movements of individuals and the time they spend in each location can be used to generate individual-location bipartite networks, whose unipartite projection defines the individual-level synthetic interaction network that governs the epidemic spreading. In this case too, although the model underlying the computational approach is a network model, each individual is annotated with place of residence, age, and many other demographic attributes that can be exploited in the analysis of the epidemic outbreak. Detailed synthetic populations thus reconstruct a statistically equivalent picture of the actual population, down to the granularity of the available data.
For the sake of simplicity, let us consider here the example of a homogeneous mixing approximation, which assumes that individuals interact with one another at random. In this minimal framework, the larger the number of infectious people among one individual’s contacts, the higher the probability of infection transmission. This readily translates into the definition of the force of infection λ, also called risk, which expresses the probability per unit time that a susceptible individual contracts the infection. In the limit of small risk, it is possible to derive the explicit form λ = βI_t/N. Here β is the transmissibility (the average number of transmissions per unit time), which depends on the specific disease as well as the contact pattern of the population; I_t is the total number of infectious individuals at time t; and N is the total population size. I_t/N is therefore the density of infectious individuals in the population. This form of the force of infection is called the mass action law and is used in many other reaction-diffusion problems in chemistry and physics. It is important to note that the force of infection is said to be frequency dependent, as it assumes that the number of contacts is independent of the population size. Therefore, the force of infection depends only on the density of infectious individuals and, for a given number of infectious individuals, decreases for larger populations, all other factors being equal. This assumption fits our intuition: the probability of being infected by one single infectious individual in a city like Paris, with about two million residents, is necessarily much lower than the probability of being infected by the same infectious individual in Bloomington, Indiana, a campus town with only 80,000 residents.
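The Paris and Bloomington comparison can be made concrete with the mass action form λ = βI/N. In this short sketch the value of β is a hypothetical placeholder; only the ratio between the two cities matters, and it does not depend on β.

```python
def force_of_infection(beta, infectious, population):
    """Mass action force of infection: lambda = beta * I / N."""
    return beta * infectious / population

beta = 0.5  # hypothetical transmissibility per day
lam_paris = force_of_infection(beta, 1, 2_000_000)
lam_bloomington = force_of_infection(beta, 1, 80_000)
ratio = lam_bloomington / lam_paris  # 2,000,000 / 80,000 = 25
```

Under frequency-dependent transmission, a single infectious individual therefore poses a risk to each resident that is 25 times higher in the smaller town.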
In order to translate the above formal relation into an explicit equation, we can define the variables S_t, I_t, and R_t denoting the number of individuals in the susceptible, infectious, and recovered compartments at time t, respectively. Given the assumption that μ and β are constant, we can easily define the associated stochastic processes that relate the stochastic variables at time t to the variables at time t+1 in the form of a simple binomial model of transmission for discrete contacts and discrete time. Each susceptible individual has a probability λ_t = βI_t/N of contracting the disease and transitioning to the infectious state.
As we assume S_t independent events occurring with the same probability, the number of newly infected individuals I+ generated at time t+1 is a random variable that follows the binomial distribution I+ ~ Bin(S_t, λ_t). The binomial distribution provides the probability that, among the S_t independent trials with probability λ_t, we have y positive events at time t+1. Analogously, the number of newly recovered individuals at time t+1 is a random variable that follows a binomial distribution R+ ~ Bin(I_t, μ), where the number of independent trials is given by the number of infectious individuals I_t that might recover and the probability of recovery in a time step is given by the recovery probability μ. If we consider a specific value of the stochastic variables S_t, I_t, and R_t, the stochastic equations regulating the behavior of the epidemic can be written as:
s_{t+1} = s_t - Bin(s_t, λ_t)
i_{t+1} = i_t + Bin(s_t, λ_t) - Bin(i_t, μ)
r_{t+1} = r_t + Bin(i_t, μ),
where Bin(s_t, λ_t) and Bin(i_t, μ) are two random variables distributed according to the respective binomial distributions. It is worth remarking here that the unit time step defines an actual time scale ∆t and that the transition probabilities must be defined as a function of this time scale.
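The stochastic equations above can be turned into a short simulation. The sketch below is a minimal chain binomial SIR model in a single, closed population with a time step of one day; the binomial draws are implemented as sums of Bernoulli trials to keep the example dependency-free, and all parameter values are hypothetical.

```python
import random

def binomial(rng, n, p):
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))

def simulate_sir(rng, s0, i0, rec0, beta, mu, steps):
    """Return the list of (s, i, r) states for one stochastic realization."""
    n = s0 + i0 + rec0
    s, i, r = s0, i0, rec0
    history = [(s, i, r)]
    for _ in range(steps):
        lam = min(1.0, beta * i / n)      # force of infection lambda_t
        new_inf = binomial(rng, s, lam)   # Bin(s_t, lambda_t)
        new_rec = binomial(rng, i, mu)    # Bin(i_t, mu)
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        history.append((s, i, r))
    return history

history = simulate_sir(random.Random(7), s0=990, i0=10, rec0=0,
                       beta=0.5, mu=0.2, steps=100)
```

Both draws use the state at time t, so an individual infected in a given step cannot recover within that same step, in agreement with the update equations.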
In the SIR model it is possible to calculate the basic reproduction number explicitly as R0 = β/μ. It is simply the transmissibility times the average duration of infectiousness of a single individual, which gives the average number of secondary infections generated by one infectious individual in a fully susceptible population.
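As a quick numerical illustration with hypothetical parameter values, a transmissibility of 0.5 per day and a recovery probability of 0.2 per day correspond to a mean infectious period of 5 days and a basic reproduction number of 2.5:

```python
def basic_reproduction_number(beta, mu):
    """R0 = beta / mu for the simple SIR model."""
    return beta / mu

beta, mu = 0.5, 0.2                  # hypothetical values, per day
mean_infectious_period = 1 / mu      # days
reproduction_number = basic_reproduction_number(beta, mu)
can_grow = reproduction_number > 1   # an outbreak can grow only if R0 exceeds 1
```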
In such a computational approach, we deal with stochastic systems, and therefore we need to generate random variables according to the specified probability distributions defined in the model. In a stochastic simulation, each sequence of random values is generated through a random number generator. Each different random input therefore provides a single stochastic instance of the system’s behavior. In the case of epidemic models, each stochastic realization will represent only one of the many possible epidemic outcomes that the same model with the same initial conditions and parameters can generate. A careful analysis of the quality of the random number generator used is advisable in all intensive large-scale computational applications.
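The role of the random input can be illustrated by running the same model many times with different seeds. The sketch below assumes a simple chain binomial SIR model in one small population and records the final outbreak size of each realization; the parameters, the population size, and the number of runs are arbitrary, and Python's built-in generator stands in for whatever generator a production code would use.

```python
import random

def outbreak_size(seed, n=200, i0=2, beta=0.6, mu=0.3, steps=200):
    """Total number of individuals ever infected in one seeded realization."""
    rng = random.Random(seed)  # the seed fully determines the realization
    s, i = n - i0, i0
    for _ in range(steps):
        lam = min(1.0, beta * i / n)
        new_inf = sum(rng.random() < lam for _ in range(s))
        new_rec = sum(rng.random() < mu for _ in range(i))
        s, i = s - new_inf, i + new_inf - new_rec
        if i == 0:
            break  # the outbreak has died out
    return n - s  # includes the initially infectious individuals

sizes = [outbreak_size(seed) for seed in range(50)]  # one ensemble of 50 runs
repeat = outbreak_size(0)                            # same seed, same outcome
```

Identical parameters and initial conditions produce a whole distribution of outcomes across seeds, while reusing a seed reproduces a realization exactly, which is useful for debugging and for auditing results.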
The simple example discussed here has to be generalized to the more complicated compartmental structures used by GLEAM for the realistic modeling of infectious diseases. In many cases this implies the use of more advanced mathematical constructions and the use of multinomial stochastic processes.
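When individuals in one compartment can leave through more than one transition, the binomial draw generalizes to a multinomial one. A minimal sketch, assuming the multinomial is sampled as a sequence of conditional binomial draws, is shown below; the function name and the probabilities are hypothetical.

```python
import random

def multinomial(rng, n, probs):
    """Split n individuals among competing transitions.

    probs: per-step probabilities of each transition (their sum must not
    exceed 1). Returns (counts per transition, number that do not move).
    """
    counts, remaining, prob_left = [], n, 1.0
    for p in probs:
        # Probability of this transition given that no earlier one occurred.
        cond = 0.0 if prob_left <= 0 else min(1.0, p / prob_left)
        k = sum(rng.random() < cond for _ in range(remaining))
        counts.append(k)
        remaining -= k
        prob_left -= p
    return counts, remaining

rng = random.Random(3)
(moved_a, moved_b), stayed = multinomial(rng, 1000, [0.10, 0.05])
```

Drawing the transitions jointly guarantees that no individual is moved twice in the same step, which independent binomial draws from the same compartment would not.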
Ranges for the reproduction number of some infectious diseases: 2–3; 2–5; 5–10; 1.5–3.5; 1.5–3.5 (disease labels not shown).
GLEAM defines a synthetic world in which we can simulate with the computer the unfolding of epidemics and pandemics. Each simulated time step represents a full day. The model needs the definition of the initial conditions that specify the number and location of individuals capable of transmitting the disease. At the start of the time step, we use the flight network to move travelers to their destination.
GLEAM also allows the introduction of seasonal variations in the transmissibility of the disease, such as in the case of influenza. Seasonality effects are still an open problem in the transmission of influenza-like illness (ILI). In order to include the effect of seasonality on the observed patterns of ILI, a standard empirical approach can be used in which seasonality is modeled by a forcing that rescales the basic reproduction number by a factor ranging from 0.1 to 1 (no reduction). This forcing is described by a sinusoidal function over a 12-month period that reaches its peak during winter and its minimum during summer in each hemisphere, with the two hemispheres in opposite phases. The minimum rescaling factor α_min of the reproduction number is a free parameter to be estimated from data. For scenario purposes it is possible to consider a mild seasonality and a strong seasonality scenario, with α_min ~0.5 and α_min ~0.1, respectively [7].
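One possible form of such a forcing is sketched below. It assumes a cosine that equals 1 at the winter peak and α_min half a year later, with the southern hemisphere shifted by half a period. The peak day (mid-January in the northern hemisphere) and the function name are assumptions made for illustration, not parameters taken from GLEAM.

```python
import math

def seasonal_factor(day, alpha_min, peak_day=15, northern=True):
    """Rescaling factor of R0 on a given day of the year (0 to 364).

    Equals 1 at the winter peak and alpha_min at the summer minimum.
    """
    phase = 0.0 if northern else math.pi  # hemispheres in opposite phases
    c = math.cos(2 * math.pi * (day - peak_day) / 365 + phase)
    return alpha_min + (1 - alpha_min) * (1 + c) / 2

strong_north = seasonal_factor(15, alpha_min=0.1)                  # winter peak
strong_south = seasonal_factor(15, alpha_min=0.1, northern=False)  # summer minimum
yearly = [seasonal_factor(d, alpha_min=0.5) for d in range(365)]   # mild scenario
```

Multiplying the basic reproduction number by this factor in each subpopulation, according to its hemisphere, yields the time-dependent transmissibility used in a seasonal scenario.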
Given the population and mobility data, infection dynamics parameters, and initial conditions, GLEAM performs the simulation of stochastic realizations of the worldwide unfolding of the epidemic. From these in silico epidemics, a variety of information can be gathered, such as the prevalence, morbidity, number of secondary cases, number of imported cases, hospitalized patients, amounts of drugs used, and other quantities for each subpopulation with a time resolution of 1 day. In the next chapter, we will see the results of the numerical simulations and why and how they can be useful to our analysis and understanding of infectious disease spreading.
3.3 INFOBOX: DISEASE COMPARTMENTAL STRUCTURE
GLEAM labels individuals in each population according to the compartment describing the state of the disease and the possibility to travel, commute, etc.
A susceptible individual in contact with a symptomatic or asymptomatic infectious person contracts the infection at rate β or r_β β, respectively, and enters the latent compartment, where they are infected but not yet infectious.
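Under homogeneous mixing, the two transmission rates combine into a single force of infection in which asymptomatic individuals are weighted by r_β. The sketch below assumes the form λ = β (I_sym + r_β I_asym) / N; the numerical values are hypothetical.

```python
def influenza_force_of_infection(beta, r_beta, symptomatic, asymptomatic, population):
    """Force of infection with reduced infectiousness of asymptomatic cases."""
    return beta * (symptomatic + r_beta * asymptomatic) / population

# Hypothetical values: 40 symptomatic and 20 asymptomatic cases among 10,000 people.
lam = influenza_force_of_infection(beta=0.6, r_beta=0.5,
                                   symptomatic=40, asymptomatic=20,
                                   population=10_000)
```

With r_β = 0.5, the 20 asymptomatic individuals contribute as much to transmission as 10 symptomatic ones.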
Influenza with antiviral pharmaceutical interventions
SARS-Like viruses and their non-pharmaceutical containment
The population of each city is classified into seven different compartments, namely, susceptible, latent, infectious, hospitalized individuals who will recover (HR), hospitalized individuals who will die (HD), dead, and recovered individuals. We assume that hospitalized as well as infectious individuals are able to transmit the infection, given the large percentage of cases that were seen among healthcare workers. The actual efficiency of hospital isolation procedures is modeled through a reduction of the transmission rate β by a factor r_β = 20%, as estimated for the early stage of the epidemic in Hong Kong. The infectiousness of patients in the hospitalized compartments HR and HD is assumed to be equal (although this assumption can easily be changed in the model). Susceptible individuals exposed to SARS enter the latent class. Latents represent infected individuals who are not yet contagious and are assumed to be asymptomatic, as suggested by results based on epidemiological, clinical, and diagnostic data in Canada. They become infectious after an average time ε⁻¹ (mean latency period). Individuals are classified as infectious for an average time equal to μ⁻¹, from the onset of clinical symptoms to their admission to the hospital, where they eventually die or recover. Patients admitted to the hospital are not allowed to travel. The average periods spent in the hospital from admission to death or recovery are equal to μ_D⁻¹ and μ_R⁻¹, respectively. The average death rate is denoted by d.
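A single daily update of this compartmental structure within one city can be sketched as follows. The sketch assumes a force of infection λ = β (I + r_β (HR + HD)) / N, with N taken as the living population, and per-day transition probabilities ε, μ, μ_R, and μ_D; all numerical values except r_β = 0.2 are hypothetical, and travel is left out.

```python
import random

def binomial(rng, n, p):
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))

def sars_step(rng, state, beta, r_beta, eps, mu, mu_r, mu_d, d):
    """Advance the seven-compartment state by one day and return the new state."""
    s, l, i, hr, hd, dead, rec = (state[k] for k in ("S", "L", "I", "HR", "HD", "D", "R"))
    alive = s + l + i + hr + hd + rec  # assumption: N excludes the dead
    lam = min(1.0, beta * (i + r_beta * (hr + hd)) / alive) if alive else 0.0
    new_latent = binomial(rng, s, lam)           # S -> L
    new_infectious = binomial(rng, l, eps)       # L -> I
    admitted = binomial(rng, i, mu)              # I -> hospital
    admitted_die = binomial(rng, admitted, d)    # share of admissions entering HD
    recovered = binomial(rng, hr, mu_r)          # HR -> R
    died = binomial(rng, hd, mu_d)               # HD -> D
    return {
        "S": s - new_latent,
        "L": l + new_latent - new_infectious,
        "I": i + new_infectious - admitted,
        "HR": hr + (admitted - admitted_die) - recovered,
        "HD": hd + admitted_die - died,
        "D": dead + died,
        "R": rec + recovered,
    }

rng = random.Random(11)
state = {"S": 4990, "L": 0, "I": 10, "HR": 0, "HD": 0, "D": 0, "R": 0}
for _ in range(30):
    state = sars_step(rng, state, beta=0.6, r_beta=0.2, eps=0.2,
                      mu=0.25, mu_r=0.04, mu_d=0.04, d=0.2)
```

All draws use the compartment sizes at the start of the day, so each individual makes at most one transition per time step.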
Viral hemorrhagic fever compartmental model
Susceptible individuals, after contact with an infectious individual (I, H, or F), enter the latent class at a rate β_I, β_H, or β_F. At the end of the latency period α⁻¹, each individual becomes infectious. Infectious individuals can then transition to the hospitalized, funeral, or removed compartments according to different parameters. Similarly, from the hospitalized and funeral compartments, individuals can enter the removed compartment. The mean duration from onset of symptoms to hospitalization is γ_h⁻¹, γ_dh⁻¹ is the mean duration from hospitalization to death, and γ_i⁻¹ denotes the mean duration of the infectious period for survivors. The mean duration from hospitalization to end of infectiousness for survivors is γ_ih⁻¹, and γ_f⁻¹ is the mean duration from death to burial.
1. Duygu Balcan et al., “Multiscale mobility networks and the spatial spreading of infectious diseases,” Proceedings of the National Academy of Sciences 106, 21484–21489 (2009).
2. Cécile Viboud et al., “Synchrony, waves, and spatial hierarchies in the spread of influenza,” Science 312, 447–451 (2006).
3. Filippo Simini et al., “A universal model for mobility and migration patterns,” Nature 484, 96–100 (2012).
4. Duygu Balcan et al., “Modeling the spatial spread of infectious diseases: The global epidemic and mobility computational model,” Journal of Computational Science 1, 132–145 (2010).
5. L. Sattenspiel and K. Dietz, “A structured epidemic model incorporating geographic mobility among regions,” Mathematical Biosciences 128, 71–91 (1995).
6. Matt J. Keeling and Pejman Rohani, “Estimating spatial coupling in epidemiological systems: a mechanistic approach,” Ecology Letters 5, 20–29 (2002).
7. Ben S. Cooper et al., “Delaying the international spread of pandemic influenza,” PLoS Medicine 3, e212 (2006).
8. Judith Legrand, Rebecca Freeman Grais, Pierre-Yves Boëlle, Alain-Jacques Valleron, and Antoine Flahault, “Understanding the dynamics of Ebola epidemics,” Epidemiology & Infection 135, 610–621 (2007).