DATA, DATA, AND MORE DATA
Given the decades of success achieved in weather forecasts and the phenomenal results in the computational modeling of new material properties, it is natural to wonder why we are at a very different stage in the quantitative forecasting of the next pandemic or the progression of seasonal influenza. The main difference is that spreading and contagion models start with assumptions about the behavior of humans and society instead of the physical laws governing fluid and gas masses. In other words, while it is possible to produce a complete temperature analysis of the sea surface and satellite images of atmospheric turbulence, we do not yet have large-scale knowledge of commuting patterns or precise maps of contact among individuals.
In recent years, however, the tremendous progress made in data collection and information technology has been lifting the limits we have faced for decades in the gathering of quantitative social, demographic, and behavioral data. Improved techniques and methodologies support the interlinkage and integration of digitized datasets with geocoding information and with economic and transportation databases.
Data-driven epidemic modeling relies on a multitude of datasets, ranging from population movement records to social and behavioral data, as well as census and health-related information. These data are the bricks that build the pillars of any computational approach to the analysis of infectious diseases. This chapter is dedicated to illustrating the extraordinarily high-quality datasets that can be leveraged to build those pillars.
THE AGE STRUCTURE OF THE WORLD POPULATION
As we go about our daily lives, we come in contact with individuals who can carry viruses, bacteria, and other pathogens, several of which are capable of causing disease and infection. Whenever the conditions are favorable, these pathogens may be able to infect us, turning us into a new vehicle for the spreading of the disease. It is not surprising, then, that the geographical diffusion of pathogens is intrinsically intertwined with human mobility. Every time a person moves from home to work, or travels to another city or country, an opportunity arises for the pathogen to spread to a new population of potential hosts.
As technology evolves, so do our traveling habits: automobiles, trains, and airplanes help to shorten physical and temporal distances, for humans and for pathogens.
Over the course of 100 years, flying went from an occupation of the eccentric, to a luxury of the wealthy and finally to a necessity in the life of millions of passengers every year.
Airline data also include the so-called origin-destination flows between airports. Origin-destination datasets report the actual number of travelers between airports, regardless of intermediate stops or connecting flights. The origin-destination flow network has many more links than the airline network, which only considers nonstop connections. Because it accounts for the number of individuals traveling from place to place, it provides a more accurate picture of the dispersion of individuals potentially carrying diseases.
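The distinction between the two networks can be made concrete with a minimal sketch. The itineraries, airport codes, and traveler counts below are invented purely for illustration, and the function names are hypothetical; no real airline dataset is assumed.

```python
from collections import Counter

# Hypothetical itineraries: (airports visited in order, number of travelers).
# All values are made up for illustration only.
itineraries = [
    (["BOS", "JFK", "LHR"], 120),   # connecting trip
    (["JFK", "LHR"], 300),
    (["BOS", "JFK"], 80),
    (["LHR", "JFK", "BOS"], 100),   # connecting trip
    (["LHR", "JFK"], 50),
    (["JFK", "BOS"], 60),
]


def nonstop_links(itins):
    """Travelers carried on each nonstop segment (the airline network)."""
    segments = Counter()
    for route, travelers in itins:
        for origin, destination in zip(route, route[1:]):
            segments[(origin, destination)] += travelers
    return segments


def origin_destination_flows(itins):
    """Travelers between first and last airport, ignoring intermediate stops."""
    flows = Counter()
    for route, travelers in itins:
        flows[(route[0], route[-1])] += travelers
    return flows


segments = nonstop_links(itineraries)
flows = origin_destination_flows(itineraries)

print(len(segments), "nonstop links;", len(flows), "origin-destination links")
```

In this toy example the pair BOS to LHR appears only in the origin-destination flows, since no nonstop segment connects the two airports directly.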
Trains, public transportation, and personal automobiles have impacted short-distance travel in much the same way that accessible airline transportation has affected long-distance spreading patterns. Now, more than ever, we can easily cover tens of miles just to go from home to the office or school and return home at the end of the day. In doing so, we greatly increase the number of people we come in contact with daily and with it the opportunities for a disease to spread.
MOBILITY PATTERNS AND EPIDEMIC SPREADING
Needless to say, knowledge of the pathogenic agent itself is a critical input in the analysis of infectious disease spreading. No single model fits all diseases. The basic features and properties of each pathogen have critical effects on the spreading process, and each model must be tailored to the particular illness under study. In this book, we focus on a particular subset of infectious diseases, which are caused by viruses and are transmitted through human-to-human interactions. We explicitly consider influenza viruses, coronaviruses, and filoviruses. Each of the corresponding diseases is caused by a different virus, whose biological features define the natural history, mortality, and transmissibility of the disease. In the following section, we provide a basic description of each of these diseases, summarizing their biological characteristics and historical spreading.
There are three types of influenza viruses that affect humans: types A, B, and C. The first type is the most widespread and is responsible for the majority of epidemics and pandemics in our collective history. Types B and C are rarer and associated with localized outbreaks. Influenza A viruses are further divided into subtypes based on their biological structure. They are made of eight single-stranded RNA segments with two major glycoproteins on the surface: hemagglutinin (HA) and neuraminidase (NA). There are at least 16 different HA and 9 different NA variants. The predominant variants found in humans are H1, H2, H3, and N1, N2. Each combination of one HA and one NA defines a subtype of the influenza A virus. In recent history, we have been subjected to H1N1, H2N2, and H3N2 pandemics. Each of these subtypes can be further divided into different strains defined by the specific combination of genes and proteins.
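As a quick worked illustration of the naming scheme, the sketch below enumerates the HA/NA combinations implied by the counts in the text (16 HA and 9 NA, both lower bounds). It only generates labels; it does not claim that every combination has been observed in nature.

```python
from itertools import product

# Counts taken from the text ("at least 16" HA and 9 NA); treated here as fixed.
HA_VARIANTS = range(1, 17)   # H1 .. H16
NA_VARIANTS = range(1, 10)   # N1 .. N9

# Every label of the form HxNy that the naming scheme allows.
all_labels = [f"H{h}N{n}" for h, n in product(HA_VARIANTS, NA_VARIANTS)]

# Labels built only from the variants predominant in humans.
human_labels = [f"H{h}N{n}" for h, n in product((1, 2, 3), (1, 2))]

# Subtypes named in the text as responsible for recent pandemics.
pandemic_subtypes = {"H1N1", "H2N2", "H3N2"}

print(len(all_labels))    # 16 * 9 possible labels
print(human_labels)
print(pandemic_subtypes <= set(human_labels))
```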
The virulence and fatality rate change from strain to strain. Influenza viruses affect animals as well and can sometimes be transferred from animals to humans. In particular, viruses circulating in swine (H1 and H3), ducks (H7), and chickens (H5 and H9) have been shown to infect humans. There are three mechanisms behind the diffusion of influenza viruses: droplet, airborne, and contact transmission. In the first case, an infected individual coughs or sneezes, expelling large droplets that reach the conjunctival or mucous membranes of a susceptible person. In the second case, there is no direct contact between the droplets and the susceptible person; instead, the droplets are aerosolized and can be inhaled by susceptible individuals. In the last case, the virus is transmitted from person to person through contact with the secretions of an infected individual or through physical contact between individuals. After coming into contact with the virus, susceptible individuals might become infected. If this is the case, the virus starts reproducing inside the new host. Typically, the symptoms arise after the first day. Infectiousness increases as the virus reproduces and reaches its peak after 2–3 days on average, after which the viral load starts to decline thanks to the reaction of the immune system.
Coronaviruses belong to the subfamily Coronavirinae in the family Coronaviridae. They are enveloped viruses with a positive-sense, single-stranded RNA genome. Several proteins contribute to the biological structure of coronaviruses, in particular the spike, envelope, membrane, and nucleocapsid proteins.
The transmission mechanisms of coronaviruses are typical of influenza-like illnesses (ILI). However, other characteristics are quite different. For example, in the case of SARS, the proportion of asymptomatic infections was relatively small, and peak infectiousness occurred about 7 days after the onset of symptoms. The virus responsible for SARS is different from all other known coronaviruses. It appears to have originally been an animal virus that crossed to humans. Indeed, the virus has been isolated from civet cats in Guangdong Province. In this region, there are many markets in which civet cats and other exotic animals are sold. A large fraction of workers in these markets were found to be seropositive for SARS. However, it is not clear whether civet cats or other animals are the natural reservoir of the virus in the wild.
Filoviruses appear as a variety of “threadlike” virions (infectious viral particles) and encode their genome in negative-sense RNA. Common shapes include long branching filaments, shorter filaments shaped like a “6” or a “U,” and even circles.
It is unknown how the virus is transmitted from its natural reservoir to humans. Once a human is infected, ways of transmission include close personal contact with infected individuals or their bodily fluids. Caregivers are at a higher risk of becoming infected due to close contact with the infectious individual. During outbreaks, the isolation of patients and the use of protective clothing and disinfection procedures are crucial to interrupt the transmission of filoviruses. In 2015, the first vaccine was developed to protect against the Ebola virus, and field trials began in West Africa shortly thereafter.
Some countries are better prepared than others to deal with the risks associated with epidemic diseases. Different indicators are commonly used to evaluate each country’s capacity to respond: for instance, the number of physicians per capita and the number of hospital beds per 10,000 individuals. Countries with a good health infrastructure will be better at combating the spreading of an epidemic as well as reducing its impact on the population. Understanding and mapping health infrastructures and other socioeconomic indicators are extremely important in building models that can provide insight into the disease burden across different countries and measure the risk associated with emerging pathogens.
Distribution of health infrastructures
List of the top and bottom 10 countries by number of physicians and hospital beds per 10,000 people.
Source: Global Health Observatory data repository, World Health Organization, 2009
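The per-10,000 indicators discussed above can be computed and ranked with a few lines of code. The sketch below is only an illustration: the country names, populations, and counts are placeholder values, not figures from the World Health Organization repository, and the helper `per_10k` is a hypothetical name.

```python
def per_10k(count, population):
    """Express a raw count as a rate per 10,000 inhabitants."""
    return 10_000 * count / population


# Placeholder records, NOT real statistics.
countries = {
    "Country A": {"population": 5_000_000, "physicians": 15_000, "beds": 25_000},
    "Country B": {"population": 20_000_000, "physicians": 4_000, "beds": 10_000},
    "Country C": {"population": 1_000_000, "physicians": 2_500, "beds": 7_000},
}

# Compute both indicators for every country.
indicators = {
    name: {
        "physicians_per_10k": per_10k(rec["physicians"], rec["population"]),
        "beds_per_10k": per_10k(rec["beds"], rec["population"]),
    }
    for name, rec in countries.items()
}

# Rank countries from highest to lowest physician density.
ranked = sorted(
    indicators,
    key=lambda name: indicators[name]["physicians_per_10k"],
    reverse=True,
)

for name in ranked:
    print(name, indicators[name])
```

Taking the first and last entries of such a ranking gives the kind of top and bottom lists described in the table caption.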
In the next chapter, we discuss how all of these data can be incorporated within the framework of epidemic modeling.