1 Introduction

Influenza and COVID-19 are infectious diseases that have a range of mild to severe symptoms (Moghadami 2017; Rothan and Byrareddy 2020). While a vaccine exists for influenza, there is not yet a vaccine widely available for COVID-19 (Houser and Subbarao 2015; Rothan and Byrareddy 2020). However, other preventative strategies exist for minimizing the spread of both diseases (Rabie and Curtis 2006; Rothan and Byrareddy 2020). This study models the use of social media as a tool to increase social awareness and prevention/treatment practices, which in turn can mitigate the COVID-19 pandemic as well as future influenza epidemics.

Influenza is a contagious respiratory disease. The Types A and B influenza viruses are responsible for seasonal flu epidemics and outbreaks each year. The flu is generally characterized by abrupt symptoms and individuals largely recovering within eight days, after the incubation period (Moghadami 2017; Gaitonde et al. 2019; Petrova and Russell 2018). However, young children, those with underlying conditions, and adults older than 65 are at high risk of serious influenza complications, which can lead to hospitalization and death (Clayville 2011; Schmid et al. 2017). The Centers for Disease Control and Prevention (CDC) reports that since 2010, influenza has caused around 9 million to 45 million illnesses, 140,000 to 810,000 hospitalizations, and 12,000 to 61,000 deaths in the United States each year (see Fig. 1) (CDC, n.d.-a).

Fig. 1 Burden of influenza epidemics (Source: CDC, n.d.-a)

Effective influenza control strategies include vaccination, the use of drug therapy, handwashing, and social distancing (Moghadami 2017; Houser and Subbarao 2015; Roth and Henry 2011). However, the annual flu vaccination is considered the best method for flu prevention (Houser and Subbarao 2015). Recent CDC studies show that flu vaccination reduces the risk of flu illness by between 40 and 60% among the overall population when the flu vaccine matches the circulating flu viruses (CDC, n.d.-b), but rates of vaccine coverage and vaccine effectiveness (VE) in the US are not optimal. In the 2018–2019 flu season, the CDC estimates that 62.6% of children (6 months to 17 years old) in the US had at least one dose of flu vaccine, which is 4.7 and 3.6 percentage points higher than in the 2017–2018 and 2016–2017 flu seasons, respectively; and 45.3% of adults in the US had at least one dose of flu vaccine, which is 8.2 and 2.0 percentage points higher than in the 2017–2018 and 2016–2017 flu seasons, respectively. Vaccine coverage among children and adults is shown in Fig. 2 (CDC, n.d.-d), and vaccine effectiveness has ranged from 19% (2014–2015 season) to 60% (2010–2011 season) (CDC, n.d.-c). Views on the influenza vaccine can create a barrier to individuals receiving their annual flu shot (Chen et al. 2020). When investigated, researchers found that a low perception of both vaccines and vaccine effectiveness, as well as an absence of trust in health authorities, affected an individual’s decision to get the vaccine. Additional reasons included an individual’s feeling that those in their peer group did not receive the vaccine, a low belief that influenza is severe, and a low belief that they themselves are at risk for influenza (Schmid et al. 2017).

Fig. 2 Influenza vaccine effectiveness and influenza vaccine coverage (Data source: CDC, n.d.-c; -d)

COVID-19 is an illness caused by the SARS-CoV-2 virus, and primarily passes from person-to-person via respiratory droplets when individuals are in close proximity (Chavez et al. 2020). The disease mainly manifests in mild to moderate illness involving upper respiratory symptoms or mild pneumonia, but can also result in critical illness, severe pneumonia, and respiratory failure (CDC, n.d.-e). Individuals may also be asymptomatic. Populations that are at a higher risk include those of an older age or with preexisting medical conditions (Chavez et al. 2020).

In addition to the ongoing development of potential vaccines for the disease, multiple antiviral drugs are undergoing testing for COVID-19 treatment (CDC, n.d.-e). Current recommendations in place to prevent spread include the wearing of masks and social distancing (Andersen 2020; Courtemanche et al. 2020). Testing for the virus is available, and emergency medicine practitioners identify and isolate patients who have or are at risk of having the infection (Chavez et al. 2020). As of July 27, 2020, there have been 4,225,687 reported cases of COVID-19 in the United States, and 146,546 reported deaths (CDC, n.d.-f). Figure 3 below depicts reported recent US hospitalizations and cumulative deaths as collected by The Covid Tracking Project (2020). Available information on the burden of the disease should be interpreted with prudence due to limitations, such as the early shortage of tests, misdiagnosis, and the asymptomatic nature of many cases.

Fig. 3 Burden of COVID-19 pandemic (Source: The Covid Tracking Project 2020)

While the rates of vaccine effectiveness and vaccine coverage for influenza are not optimal, and a vaccine is not yet available for COVID-19, handwashing and social distancing practices, ranging from moderate to total isolation, can be used at all times during an infectious disease outbreak (Tam et al. 2006). In practice, the more preventive knowledge people have, the better they are able to protect themselves by adopting necessary measures. Media reports can contain such preventative knowledge, and are able to influence the behavior of the public (Wakefield et al. 2010, 2011; Funk et al. 2014; Collinson et al. 2015). These media outlets include informative literature (e.g., pamphlets), posters, newspaper articles and advertisements, radio and television messages, and social media (e.g., Twitter, Facebook).

Social media is prevalent today, particularly among the younger generation. Thus, it has been used in real-time analysis and for faster trend predictions in many areas (Moorhead et al. 2013; Mishra and Singh 2018) such as traffic, waste, disaster prediction, and networking. It can serve as a resource for disease surveillance and is an efficient way to communicate preventative actions to slow spread during disease outbreaks (Corley et al. 2010; Mowery 2016; Anparasan and Lejeune 2019).

Many studies in the literature have used social media, specifically Twitter, to monitor the population’s health (Abbasi et al. 2014; Paul et al. 2016). Several researchers have used Twitter data to monitor influenza prevalence (Culotta 2010; Signorini et al. 2011), to predict disease transmission between individuals (Sadilek et al. 2012), and to forecast future prevalence (Paul et al. 2014). Additionally, some studies have analyzed attitudes and sentiment toward vaccination using Twitter (Salathe and Khandelwal 2011; Salathe et al. 2013; Dunn et al. 2015; Dredze et al. 2016). The use of social media data to detect the spread of epidemics or pandemics, such as the flu or COVID-19, can help to obtain early warnings. New techniques for analysis of search engine logs (Polgreen et al. 2008; Ginsberg et al. 2009; Majumder et al. 2015) and social media data can be used to obtain real-time analysis, creating better services.

Social media can elicit positive behavior changes of the public, and therefore help to reduce the risk of infection in the population. For example, Ahmed et al. (2018) examined the relationships among social media use, social media as a source of health information, and influenza vaccination status in 2015. Their results indicate that those who use Twitter and Facebook as sources of health information were more likely to be vaccinated than users who do not use Twitter or Facebook as sources of health information.

In general, social media sites are Internet platforms in which an individual creates a profile and associated list of fellow users, and views posts of various forms from others on the platform (Boyd and Ellison 2007; Kullar et al. 2020). Twitter and Facebook are the two largest social networks (Kallas 2020). Twitter is a microblogging social media platform where individuals communicate through tweets, which are posts of 280 characters or fewer that allow for the use of hashtags to indicate group topics. The platform had a wide and active user base of 152 million daily users in 2019, and many healthcare professionals use the social media site to deliver real-time health-related information across the globe (Kullar et al. 2020; Holcomb 2011). Twitter has been reported to be the most highly utilized social media platform for healthcare communication (Pershad et al. 2018; Kullar et al. 2020). In a review of studies that used social networking platforms to predict and detect influenza, Twitter was a common representative of the social media component. This review reported several benefits to using Twitter, including that the age of Twitter users is varied, posts are descriptive and highly frequent, and demographic details of users can be available (Alessa and Faezipour 2018). Additionally, Twitter uses short messages and hashtags to communicate, makes it convenient to collect data on an event, and is used by reliable institutions to collect data on events (Holcomb 2011). For these reasons, as well as in an effort to obtain accurate and cohesive data, we have set tweets via the Twitter platform to represent the social media component of our model.

Today, the humanitarian response logistics of many disasters, including influenza epidemics and the COVID-19 pandemic, likely utilize social media. Current disaster relief operations have the benefit of immediate information dissemination with wide reach. Through social media, individuals can be directed to journals and healthcare institution guidelines to seek prevention advice and understand symptoms. For individuals with mild illness, tools and guidelines from reliable sources can alleviate the burden on health systems during disasters. In dealing with an emerging disaster, scientific research can be facilitated at a high speed, with additional social media data and connectivity between research institutions. Moreover, the humanitarian response to the current COVID-19 pandemic has shown the importance of social media during a disaster in the form of several other social factors, including increased capability of remote learning and access to psychological aid (Merchant and Lurie 2020). These items emphasize the need for social media in emergency preparedness today. Merchant and Lurie (2020) provide a powerful viewpoint that underscores the motivation for our study and its application in the humanitarian logistics of a pandemic response.

This paper aims to study and quantify the effectiveness of using social media as a humanitarian response to mitigate influenza epidemics and the current COVID-19 pandemic. We extend the standard SEIR-V model to incorporate social media in order to increase the accuracy of transmission dynamics, and perform design of experiments and stochastic simulations to examine the following research questions:

  1. Is social media a beneficial behavioral intervention for infectious diseases?

  2. How has the inclusion of social media affected number of cases and deaths due to influenza and COVID-19?

  3. What is the most effective strategy of social media use on the response to infectious diseases?

Our results indicate that social media has a positive effect in mitigating the spread of contagious disease and a synergistic effect with other preventative and mitigating policies. We found that social media’s effect has a non-linear relationship with the reproduction number \({R}_{0}\), and is most effective when \({R}_{0}\) is between 1.5 and 1.9. Social media’s effect on seasonal influenza would be more evident if a vaccine is used, and is accompanied with other measures and policies in the mitigation of COVID-19.

The remainder of the paper is organized as follows. Section 2 offers a review of related literature while also identifying research gaps and how this study will address them. Section 3 offers the structure of the proposed generic model based on the SEIR dynamic transmission compartmental model by capturing the information of social media for influenza epidemic (see Fig. 4) and COVID-19 pandemic (see Fig. 5). This section also offers an estimation of parameters, detailed numerical analysis, and results obtained by running the two models, separately. Section 4 offers a detailed discussion on findings from running the two disease models. Section 5 offers major takeaways including research contribution from this study. Finally, Sect. 6 discusses limitations of this study and potential future work.

Fig. 4 SEIR-V model for seasonal influenza modified for social media

Fig. 5 SEIR model for COVID-19 modified for social media

2 Literature review

Digital information has many uses in the study and mitigation of infectious diseases and disaster situations (Fast et al. 2018; Dubey et al. 2019a, b; Singh et al. 2019; DuHadway et al. 2019; Wamba et al. 2019). Some evidence also exists (Griffith et al. 2019) that the big data analytics and AI technologies can assist visibility (e.g. with open-source imagery tools and analytic mapping tools) in disaster relief operations, but this implementation process requires further investigation (Dubey et al. 2019a, b).

Social media is an omnipresent part of our world, and the current COVID-19 pandemic has shown what an integral role it plays in news, science, and personal communication. Human behavior is affected by information given and received via social media, and thus, social media is a factor in the use of intervention during times of infectious diseases. There is a dearth of infectious disease mathematical models that involve social media and its effects on behaviors. Ultimately, including this factor could greatly improve the accuracy and true reflection of disease spread.

Social media has recently been used in the detection, surveillance, and prediction of the flu via a variety of methods, including text mining, graph data mining, topic models, machine learning techniques, internal market, external market, math/statistical based models, and mechanistic disease models (Alessa and Faezipour 2018). Social media can be used to assess people’s sentiments regarding vaccinations (Salathe and Khandelwal 2011) and to track vaccination uptake (Huang et al. 2017, 2019). Aslam et al. (2014) successfully increased the immediacy of influenza surveillance through tweets. Santillana et al. (2015) proposed an improved ILI machine learning prediction model utilizing social media along with other data sources. By using these methods, the authors were able to predict weekly ILI estimates up to four weeks prior to the release of the CDC’s ILI reports. Social media is an important tool that has the potential to be used for data collection amidst future influenza outbreaks (Allen et al. 2016).

Social media is now also being used as a method of data collection for COVID-19. During the emergence of a new disease, such as COVID-19, data on the spread and burden of a disease are essential. Social media can assist in surveillance when there is a dearth of information and data available on an illness, such as COVID-19 (e.g., Li et al. 2020). Li et al. (2020) found that for every 40 social media posts, there were approximately ten additional reported COVID-19 cases in the region. Studies that mined Twitter for COVID-related data (e.g., Mackey et al. 2020) have been emerging over the last few months as part of an effort to fill in the large knowledge gaps on data surrounding the disease (Qazi et al. 2020).

A common method that is used in studying the spread of both influenza and COVID-19 is the use of a compartmental model, which will also be employed throughout this paper. This method is effective in its incorporation of a constantly changing transmission rate, as is characteristic of infectious diseases, such as influenza and COVID-19. For example, Wang et al. (2011) extended a SEIR epidemic model that incorporated influenza-related complications. Yang et al.’s (2018) paper analyzed cost-effectiveness of the universal influenza vaccination using a similar epidemic model. Sah et al. (2018) used an age-structured dynamic model showing influenza transmissions and vaccinations to study the 2017–2018 flu season and analyze how low-efficacy vaccinations still had an impact. Kucharski et al. (2020) utilized a stochastic SEIR dynamic transmission model and data on COVID-19 cases in and originating in Wuhan to evaluate spread of the disease in January and February of 2020. Matrajt and Leung (2020) used an age-structured SEIR transmission model to investigate how social distancing affected the spread of COVID-19.

However, fewer studies exist that incorporate social media effects. Pawelek et al. (2014) built a transmission model with susceptible, exposed, and infected compartments, while also incorporating the number of tweets that were affiliated with influenza. Authors concluded that Twitter may be better used for surveillance, as opposed to being used as an early detector. Mitchell and Ross (2016) utilized a deterministic SEEIIR-M model, which included two compartments for exposed and infected people, as well as one susceptible and one recovered with media compartment, finding a relationship between media awareness and the size of an outbreak.

In summary, most studies in current literature focus on the use of social media data in forecasting and surveillance of infectious diseases, but few examine its use in controlling an outbreak. Our study will further the knowledge of the effectiveness of using social media to mitigate an infectious disease outbreak, such as influenza epidemics or the current COVID-19 pandemic. In this paper, we incorporate the concept that social media can impact individual behavior during a pandemic. We investigate how using social media reduces an outbreak at different contagiousness levels of an infectious disease, as well as how using social media interacts with other mitigation measures (e.g., vaccination). We hope that our findings underscore the importance and multitude of uses of social media data collection during infectious disease outbreaks, and help decision makers to prepare effective responses to public health crises.

3 The model

Systems of differential equations have been used to study the effect of mass media on epidemics by employing the well-known Susceptible-Exposed-Infectious-Recovered (SEIR) model and various extensions (e.g., Tchuenche et al. 2011; Cui et al. 2008). In these studies, mass media has been incorporated using different, but qualitatively similar, functions that directly affect disease transmission and susceptibility. In general, the chosen functions are decreasing functions with respect to the current number of infected individuals in the population. Our generic model extends the SEIR dynamic transmission compartmental model by capturing the information from social media for influenza epidemics (see Fig. 4) and the COVID-19 pandemic (see Fig. 5), respectively. \(M(t)\) is the total number of tweets about the infectious disease at any given time. Since there is no vaccine widely available yet for COVID-19, the generic model for the COVID-19 pandemic shown in Fig. 5 includes neither the vaccine compartment “V” nor the two vaccine rates \(v\) and \({v}_{1}\) of Fig. 4 (generic model for influenza epidemics).

Variables and parameters shown in Figs. 4 and 5, as well as the respective system of differential equations given in (1) and (2) for the two models are as follows: S contains the susceptible individuals who are not influenced by the tweets. S1 contains the susceptible individuals who read and are influenced by the tweets. V includes all vaccinated individuals. Flow rates between compartments are defined by model parameters. Exposed individuals enter the infectious compartment I, at the rate \(\sigma ,\) and infectious individuals enter the recovered compartment R, at rate γ, with immunity thereafter, or enter the death compartment D at rate \(\delta \). For influenza, the susceptible individuals in S are vaccinated at per capita rate \(v\), and the susceptible individuals in S1 are vaccinated at per capita rate \({v}_{1}\). For COVID-19, since no vaccine is available, variable V is set to 0 and vaccine rates v and v1 are both set to 0. Individuals who are influenced by the tweets at time t will move to S1 at the rate of \(\tau M(t)\). The transmission rates \(\beta \) and \({\beta }_{1}\) are the rates at which a susceptible individual in S and S1 is infected by infectious individuals, respectively. N is the total population.

Seasonal influenza:

$$\frac{dS}{dt}=-\frac{\beta }{N}IS-\tau M(t)S-vS$$
$$\frac{d{S}_{1}}{dt}=-\frac{{\beta }_{1}}{N}I{S}_{1}+\tau M\left(t\right)S-{v}_{1}{S}_{1}$$
$$\frac{dV}{dt}=vS+{v}_{1}{S}_{1}$$
$$\frac{dE}{dt}=\frac{\beta }{N}IS+\frac{{\beta }_{1}}{N}I{S}_{1}-\sigma E$$
$$\frac{dI}{dt}=\sigma E-\gamma I-\delta I$$
$$\frac{dR}{dt}=\gamma I$$
$$\frac{dD}{dt}=\delta I$$
(1)

COVID-19:

$$\frac{dS}{dt}=-\frac{\beta }{N}IS-\tau M(t)S$$
$$\frac{d{S}_{1}}{dt}=-\frac{{\beta }_{1}}{N}I{S}_{1}+\tau M\left(t\right)S$$
$$\frac{dE}{dt}=\frac{\beta }{N}IS+\frac{{\beta }_{1}}{N}I{S}_{1}-\sigma E$$
$$\frac{dI}{dt}=\sigma E-\gamma I-\delta I$$
$$\frac{dR}{dt}=\gamma I$$
$$\frac{dD}{dt}=\delta I$$
(2)
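
As a rough illustration of how the influenza system can be integrated numerically, the following Python sketch applies a fixed-step forward Euler scheme. It is not the MATLAB code used in this study. The function name, all parameter values, the constant \(M(t)\), and the initial conditions are placeholder assumptions, and the death flow is taken as \(\delta I\), following the description of \(\delta\) as the rate at which infectious individuals enter D. Setting v = v1 = 0 gives the COVID-19 variant.

```python
def simulate_seir_v(beta, beta1, sigma, gamma, delta, tau, v, v1,
                    media, n_pop, e0, days, dt=0.1):
    """Forward-Euler integration of the social-media SEIR-V system.

    media: function t -> M(t), the normalized tweet volume (assumed input).
    Returns a list of daily states (S, S1, V, E, I, R, D).
    """
    s, s1, vac, e, i, r, d = n_pop - e0, 0.0, 0.0, float(e0), 0.0, 0.0, 0.0
    steps_per_day = round(1 / dt)
    history = [(s, s1, vac, e, i, r, d)]
    for step in range(days * steps_per_day):
        m = media(step * dt)
        inf_s = beta / n_pop * i * s       # new exposures from S
        inf_s1 = beta1 / n_pop * i * s1    # new exposures from S1
        aware = tau * m * s                # S -> S1 driven by tweets
        ds = -inf_s - aware - v * s
        ds1 = -inf_s1 + aware - v1 * s1
        dv = v * s + v1 * s1
        de = inf_s + inf_s1 - sigma * e
        di = sigma * e - gamma * i - delta * i
        dr = gamma * i
        dd = delta * i                     # assumed death flow out of I
        s, s1, vac = s + dt * ds, s1 + dt * ds1, vac + dt * dv
        e, i = e + dt * de, i + dt * di
        r, d = r + dt * dr, d + dt * dd
        if (step + 1) % steps_per_day == 0:
            history.append((s, s1, vac, e, i, r, d))
    return history


# Placeholder values for illustration only (not the values of Table 1).
traj = simulate_seir_v(beta=0.4, beta1=0.4 * (1 - 0.16), sigma=0.5,
                       gamma=0.3, delta=0.001, tau=0.004, v=0.0004,
                       v1=0.001, media=lambda t: 1.0, n_pop=100000.0,
                       e0=10, days=200)
final = traj[-1]
print(final)
```

Because the right-hand sides sum to zero, the total population is conserved, which offers a simple sanity check on any implementation.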

Section 3.1 provides details on the estimation of parameters in the SEIR-V model for seasonal influenza modified for social media, whereas Sect. 3.2 provides numerical analysis and results obtained from running the model. Section 3.3 provides separate numerical analysis, and is dedicated to the SEIR COVID-19 model modified for social media with a different set of parameter estimations than those used for seasonal influenza.

3.1 Model parameters for influenza

We obtained the number of flu tweets from HealthTweets.org. The data collection process in HealthTweets.org uses “streams” of public data of Twitter. The “health” stream downloads only tweets containing any of the 269 health-related keywords, which include lists for possessive words, flu related words, fear related words, “self” words, and “other” words. The specific data collection methodology is explained in Lamb et al. (2013). Figure 6 plots the daily normalized number of flu tweets in the 2018–2019 season. These data will serve as \(M\left(t\right),\) where t is a day.

Fig. 6 The normalized number of tweets about influenza in 2018–2019 season by day

According to the CDC, handwashing, one of the most effective prevention measures, can reduce the risk of respiratory infections by 16–21% (CDC, n.d.-h). We therefore assume that:

$${\beta }_{1}=\beta \left(1-\eta \right),$$
(3)

where \(\eta \in (16\%, 21\%)\). There is no available data regarding parameters \(\tau ,v,{v}_{1}\) defined in this model. We developed the following three conditions based on observable information to formulate an optimization model to estimate these parameters.

Flu vaccine coverage in the 2018–2019 season was 45.3% among adults, and 62.6% among those less than 18 years old (CDC, n.d.-d). The 2010 US census reported that 76% of Americans are adults, and 24% are less than 18 years old. Thus, the average flu vaccine coverage in the US population is approximately \(45.3\%\times 76\%+62.6\%\times 24\%=49.5\%\). Because the vaccine effectiveness in 2018–2019 was 29%, the effective vaccine coverage in 2018–2019 was approximately \(49.5\%\times 29\%=14.36\%\) (Condition 1).
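
The arithmetic behind Condition 1 can be checked with a few lines of Python. The variable names are illustrative; the numbers are those quoted above, and the coverage is rounded to 49.5% before applying the vaccine effectiveness, as in the text.

```python
adult_share, child_share = 0.76, 0.24      # 2010 US census shares
adult_cov, child_cov = 0.453, 0.626        # 2018-2019 coverage (CDC)
vaccine_effectiveness = 0.29               # 2018-2019 VE

# Population-weighted average coverage.
coverage = adult_cov * adult_share + child_cov * child_share
# Round to three decimals (49.5%) before multiplying, as done in the text.
effective = round(coverage, 3) * vaccine_effectiveness

print(f"coverage  = {coverage:.4f}")
print(f"effective = {effective:.5f}")
```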

Ahmed et al. (2018) found that the odds of getting the flu vaccine among Twitter users are 4.4 times the odds among non-Twitter users. There are 330 million active Twitter users worldwide (Statista 2019a, b) and 262 million international active Twitter users (Statista 2019a, b). Thus, there are approximately (330 − 262) = 68 million active Twitter users in the US, which is about 21% of the total US population (about 327.2 million people, see USAFacts 2019). Let \(x\) be the number of non-Twitter users getting vaccinated and \({x}_{1}\) be the number of Twitter users getting vaccinated. Assume the US population = 100; then the number of Twitter users = 21, the number of non-Twitter users = 79, and the total number of people getting vaccinated = 50. So, we have:

$$x+{x}_{1}=50$$
$$\frac{{x}_{1}}{21-{x}_{1}}=\frac{4.4x}{79-x}$$
(3)

Solving Eqs. (3), we have \(x=33.9\) and \({x}_{1}=16.1\). In other words, among all people receiving the vaccine in the entire season, the number of people from compartment \(S\) is 2.1 (= 33.9/16.1) times the number of people from \({S}_{1}\) (Condition 2).
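
The solution of the two equations above can be reproduced with a short Python sketch. Substituting \({x}_{1}=50-x\) into the odds-ratio equation gives a quadratic in \(x\), whose positive root is the admissible one. The function name and argument names are hypothetical.

```python
import math


def split_vaccinated(total_vacc, users, non_users, odds_ratio):
    """Solve x + x1 = total_vacc and x1/(users-x1) = odds_ratio*x/(non_users-x).

    Substituting x1 = total_vacc - x yields a*x**2 + b*x + c = 0.
    """
    a = odds_ratio - 1
    b = odds_ratio * (users - total_vacc) + total_vacc + non_users
    c = -total_vacc * non_users
    x = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)  # positive root
    return x, total_vacc - x


x, x1 = split_vaccinated(total_vacc=50, users=21, non_users=79, odds_ratio=4.4)
print(round(x, 1), round(x1, 1), round(x / x1, 1))  # prints 33.9 16.1 2.1
```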

Assume \(p\%\) of Twitter users are influenced by Twitter, but not all of them get vaccines. Assume the US population = 100; then the size of \({S}_{1}\le 21\). Because \({x}_{1}=16.1\), the size of \({S}_{1}\ge 16.1\), i.e., 16.1% of the total population \(\le \) the size of \({S}_{1}\) \(\le \) 21% of the total population (Condition 3).

Let \(\widehat{V}\) be the total number of people vaccinated at the end of a flu season, \(\widehat{{S}_{1}}\) be the total number of people who have entered \({S}_{1}\) by the end of a flu season, and \(\widehat{SV}\) be the total number of people vaccinated who are from \(S\), and \(\widehat{{S}_{1}V}\) be the total number of people vaccinated who are from \({S}_{1}\). According to Conditions 1 through 3, we formulate an optimization problem as follows to estimate \(\tau ,v,{v}_{1}\).

$$\mathrm{min}\left|\widehat{SV}-2.1\widehat{{S}_{1}V}\right|$$

subject to

$$16.1\%N\le \widehat{{S}_{1}}\le 21\%N$$
$$14.3\%N\le \widehat{V}\le 14.4\%N$$
$${v}_{1}\ge v\ge 0$$
(4)

We numerically solve the above optimization problem (4) by searching the solution space through iteratively running the differential Eqs. (1) in MATLAB. Table 1 below summarizes the initial parameters used in the model to solve (4).
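
One possible reading of this search is a brute-force grid search over \((\tau ,v,{v}_{1})\), sketched below in Python rather than MATLAB. The sketch is an assumption, not the procedure actually used: the integrator is a plain Euler scheme with N = 100, the function names are hypothetical, the epidemiological defaults and the constant \(M(t)\) are placeholders, and the grid is deliberately coarse, so it may well contain no feasible point (in which case `None` is returned).

```python
from itertools import product


def season_totals(tau, v, v1, eta=0.16, beta=0.4, sigma=0.5, gamma=0.3,
                  delta=0.001, days=200, dt=0.5, media=lambda t: 1.0):
    """Euler-integrate the influenza system with N = 100; return cumulative flows.

    Returns (S->V total, S1->V total, S->S1 total), all in percent of N.
    Epidemiological defaults are placeholders, not the values of Table 1.
    """
    n = 100.0
    s, s1, e, i = n - 0.01, 0.0, 0.01, 0.0
    sv = s1v = s1_in = 0.0
    beta1 = beta * (1 - eta)
    for step in range(int(days / dt)):
        m = media(step * dt)
        to_s1, vac_s, vac_s1 = tau * m * s, v * s, v1 * s1
        inf_s, inf_s1 = beta / n * i * s, beta1 / n * i * s1
        s += dt * (-inf_s - to_s1 - vac_s)
        s1 += dt * (-inf_s1 + to_s1 - vac_s1)
        e += dt * (inf_s + inf_s1 - sigma * e)
        i += dt * (sigma * e - (gamma + delta) * i)
        sv, s1v, s1_in = sv + dt * vac_s, s1v + dt * vac_s1, s1_in + dt * to_s1
    return sv, s1v, s1_in


def grid_search(taus, vs, v1s):
    """Return (objective, tau, v, v1) for the best feasible grid point, or None."""
    best = None
    for tau, v, v1 in product(taus, vs, v1s):
        if v1 < v:                      # constraint v1 >= v >= 0
            continue
        sv, s1v, s1_in = season_totals(tau, v, v1)
        if 16.1 <= s1_in <= 21.0 and 14.3 <= sv + s1v <= 14.4:
            obj = abs(sv - 2.1 * s1v)
            if best is None or obj < best[0]:
                best = (obj, tau, v, v1)
    return best


# Coarse illustrative grid; a finer grid would be needed in practice.
grid = [k * 1e-4 for k in range(1, 21)]
best = grid_search(taus=[k * 1e-3 for k in range(1, 6)], vs=grid, v1s=grid)
print(best)
```

Collecting every feasible point whose objective falls below a tolerance, instead of keeping only the minimizer, would only require replacing `best` with a list.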

Table 1 Model Parameters

When \(\eta =16\%\), the best found solution is \(v=4.16\times {10}^{-4},{v}_{1}=9.92\times {10}^{-4}, \tau =3.66\times {10}^{-3}\), with the objective value of 0.0010; when \(\eta =21\%\), the best found solution is \(v=4.07\times {10}^{-4},{v}_{1}=10.21\times {10}^{-4}, \tau =3.48\times {10}^{-3}\), with the objective value of 0.0004. In order to accommodate potential errors in estimation, instead of using the single best solution, we include all feasible solutions that we found numerically as long as the objective function \(\left|\widehat{SV}-2.1\widehat{{S}_{1}V}\right|<0.5\). Let the solution set \({S}_{\eta }=\left\{\tau ,v,{v}_{1}|\left|\widehat{SV}-2.1\widehat{{S}_{1}V}\right|<0.5, \eta \right\}\). \(\left|{S}_{\eta =16\%}\right|=519\), and \(\left|{S}_{\eta =21\%}\right|=385\). The descriptive statistics of the two solution sets in Table 2 show that the solutions are very close to one another within \({S}_{\eta }\) and across \({S}_{\eta }\). The solutions are insensitive to the value of \(\eta \).

Table 2 Descriptive statistics of the optimal solutions

3.2 Numerical analysis: influenza

We use the following performance measures to evaluate the effectiveness of social media: (1) peak time when the infected is at its maximum, (2) peak magnitude, which is the number of people who are infected at the peak time, (3) total infected, which is the cumulative number of people who get influenza by the end of the season, (4) total vaccinated, which is the cumulative number of people who get vaccinated by the end of the season, and (5) the total deaths caused by influenza.
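
For concreteness, these measures could be computed from daily simulation output as in the Python sketch below. The function and argument names are hypothetical, the toy series is invented purely to show the call, and we assume that the cumulative totals are sums of daily new counts.

```python
def performance_measures(infected, new_infections, new_vaccinations, new_deaths):
    """Summarize one simulated season from daily series of equal length."""
    peak_magnitude = max(infected)
    peak_time = infected.index(peak_magnitude)  # first day the maximum occurs
    return {
        "peak_time": peak_time,
        "peak_magnitude": peak_magnitude,
        "total_infected": sum(new_infections),
        "total_vaccinated": sum(new_vaccinations),
        "total_deaths": sum(new_deaths),
    }


# Toy five-day series, not simulation output.
summary = performance_measures(
    infected=[1, 4, 9, 7, 3],
    new_infections=[1, 3, 6, 2, 0],
    new_vaccinations=[5, 5, 4, 4, 3],
    new_deaths=[0, 0, 1, 0, 1],
)
print(summary)
```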

We design an experiment to examine the effects of three factors: social media, vaccine, and reproduction number \({R}_{0}\) (see Table 3). Social media has three levels, respectively representing no effect, low transmission rate reduction, and high transmission rate reduction. Vaccine has two levels, with and without vaccine. Biggerstaff et al. (2014) found that the median value of the \({R}_{0}\) of seasonal epidemic influenza is 1.28, with a 25th percentile of 1.19 and a 75th percentile of 1.37, according to 24 studies on 47 seasonal flu epidemics. Chowell et al. (2008) found that the \({R}_{0}\) of seasonal epidemic influenza has a mean of 1.3 with year-to-year variability (range 0.9–2.1). Cowling et al. (2010) studied the 2009 H1N1 infections in Hong Kong and found that the effective reproduction number ranges from 1.1 to 1.5. Thus, in our study, we set three levels for \({R}_{0}\), where \({R}_{0}=1.1\) represents a mild seasonal influenza epidemic, \({R}_{0}=1.28\) represents a typical seasonal influenza epidemic, and \({R}_{0}=1.5\) represents a severe seasonal influenza epidemic. There are a total of \(18=3\times 2\times 3\) design points.

Table 3 Design of experiment
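
The full factorial design can be enumerated as in the Python sketch below. Encoding the "no effect" level of social media as \(\eta =0\) is our assumption for illustration; the low and high levels use \(\eta =0.16\) and \(\eta =0.21\).

```python
from itertools import product

social_media_eta = [0.0, 0.16, 0.21]   # no effect (assumed eta = 0), low, high
vaccine = [False, True]                # without and with vaccine
r0_levels = [1.1, 1.28, 1.5]           # mild, typical, severe season

# Every combination of the three factors is one design point.
design_points = list(product(social_media_eta, vaccine, r0_levels))
print(len(design_points))  # prints 18
```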

We use a stochastic simulation model to examine the 18 design points listed in Table 3, and we run 1,000 replications at each design point. To simulate the process at each design point in Table 3, we model \(\Delta {N}_{ij}\), the number of people moving from compartment \(i\) to \(j\) over a time interval Δt, as a binomial random variable (e.g., King and Ionides 2016). Table 4 displays the specific binomial distributions.

Table 4 Probability distributions of the number moving between compartments over time interval Δt
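
A minimal Python sketch of one binomial transition step is given below. It does not reproduce the distributions of Table 4: the exit probabilities \(1-{e}^{-\mathrm{rate}\cdot \Delta t}\), the splitting of competing exits in proportion to their rates, the omission of vaccination, and all parameter values are assumptions made for illustration, and the function names are hypothetical.

```python
import math
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def step(state, beta, beta1, sigma, gamma, delta, tau_m, dt, rng):
    """Advance integer compartments (S, S1, E, I, R, D) by one interval dt.

    tau_m stands for tau * M(t) on the current day; vaccination is omitted.
    """
    s, s1, e, i, r, d = state
    n = sum(state)                       # total population N
    foi = i / n                          # fraction infectious
    rate_s = beta * foi + tau_m          # total exit rate from S
    out_s = binomial(s, 1 - math.exp(-rate_s * dt), rng)
    # Split the leavers between infection and awareness by relative rate.
    s_to_e = binomial(out_s, beta * foi / rate_s, rng) if rate_s else 0
    s_to_s1 = out_s - s_to_e
    s1_to_e = binomial(s1, 1 - math.exp(-beta1 * foi * dt), rng)
    e_to_i = binomial(e, 1 - math.exp(-sigma * dt), rng)
    out_i = binomial(i, 1 - math.exp(-(gamma + delta) * dt), rng)
    i_to_d = binomial(out_i, delta / (gamma + delta), rng)
    i_to_r = out_i - i_to_d
    return (s - out_s, s1 + s_to_s1 - s1_to_e, e + s_to_e + s1_to_e - e_to_i,
            i + e_to_i - out_i, r + i_to_r, d + i_to_d)


rng = random.Random(42)                  # fixed seed for reproducibility
state = (990, 0, 0, 10, 0, 0)
for _ in range(100):
    state = step(state, beta=0.4, beta1=0.336, sigma=0.5, gamma=0.3,
                 delta=0.001, tau_m=0.004, dt=1.0, rng=rng)
print(state)
```

Repeating such a run many times with different seeds yields the replications from which the performance measures are averaged.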

The simulation procedure developed in MATLAB is described as follows.

figure a

Figure 7 shows the main effects of \({R}_{0}\), social media, and the vaccine on the five responses based on the simulation results. Among the three factors, \({R}_{0}\) has the largest effect on all five responses. As \({R}_{0}\) increases (i.e., the disease becomes more contagious), the peak magnitude, the total infected, and the total deaths all increase sharply, but the time to peak shortens and the vaccine coverage decreases. For example, in a severe flu season (\({R}_{0}=1.5\)), the total infected could be 4 times, and the peak magnitude nearly 8 times, that of a mild flu season (\({R}_{0}=1.1\)), and the epidemic would peak about 40 days sooner. However, the difference between the peak times when \({R}_{0}=1.1\) and \({R}_{0}=1.28\) is statistically insignificant.

Fig. 7 Main effect of R0, social media and vaccine

Social media displays a similar pattern on peak timing, peak magnitude, total infected, and total deaths. As the effect of social media increases, the peak timing, the peak magnitude, the total infected, and the total deaths all decrease, but the vaccine coverage increases. For example, when social media is in effect (\(\eta =0.21\)), the total infected could be reduced by about 11% and the peak magnitude could be reduced by 12% from the level when there is no social media, and it would peak about 7 days sooner. As the effect of social media \(\eta \) increases from 0.16 to 0.21, the decreases in the peak time, the peak magnitude, the total infected, and the total deaths are relatively small, and statistically insignificant.

Vaccine has significant impact on the influenza progression. When a vaccine is present (about 14% of the population is effectively vaccinated), the peak time, the peak magnitude, the total infected, and the total deaths are all significantly decreased in comparison to a scenario with no vaccine. For example, the total infected and the peak magnitude could be both reduced by 21%, and it would peak approximately 16 days sooner.

We also examine the interaction effects between these factors on all five responses. Almost no interaction between the factors was observed on the peak magnitude, total infected, total deaths, or vaccine coverage; the exception is the peak time. Figure 8 displays the interaction effect between any pair of the factors on the peak time. Vaccine has no practically meaningful effect on the peak time when \({R}_{0}=1.28\) or 1.5, but the peak time is significantly shortened when \({R}_{0}=1.1\). In other words, the 14% effective vaccine coverage is large enough for a mild influenza season that the time to peak can be shortened by 46 days, but it would have limited effect on peak time during a regular or severe influenza season. A similar pattern appears with social media. The peak time remains largely unchanged regardless of social media use during a regular or a severe influenza season, but peak time could be shortened by about 20 days during a mild season. Vaccine and social media can each amplify the other’s effect on the peak time. When a vaccine is present, social media can further shorten the time to peak, in comparison to no vaccine. Similarly, when social media is present, the vaccine can shorten the peak time even more, in comparison to no social media.

Fig. 8 Interaction plot for peak time

Note that Figs. 7 and 8 were created in Minitab 18.

3.3 Numerical analysis: COVID-19

COVID-19 is caused by a coronavirus, a type of pathogen that generally affects the human respiratory system (Rothan and Byrareddy 2020). An outbreak was predicted because the early reported reproduction number of the virus was thought to be greater than 1 (Zhao et al. 2020). A patient’s age and the state of his or her immune system affect the length of this time period, and common symptoms include fever, fatigue, dry cough, and headache, as well as a variety of issues within the respiratory tract and intestine. The virus is thought to spread primarily via person-to-person transmission, such as through droplet spread or direct contact. Efforts to reduce person-to-person spread, particularly among susceptible populations, have been a top priority in controlling the disease (Rothan and Byrareddy 2020). With no vaccine available, social distancing is considered a vital method to slow the spread of the disease. Several researchers have examined social distancing measures in the context of COVID-19 spread in the United States. Mandatory measures were found to be effective, whereas the results of voluntary measures varied with demographic traits and media consumption levels, which were associated with different levels of social distancing across counties (Andersen 2020; Courtemanche et al. 2020).

In this section, we use the same performance measures to evaluate the effectiveness of social media in the COVID-19 pandemic: (1) peak time, when the number of infected is at its maximum, (2) peak magnitude, which is the number of people who are infected at the peak time, (3) total infected, which is the cumulative number of people who contract COVID-19 by the end of the time period studied, and (4) total deaths caused by COVID-19.
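
The four measures can be read directly off a simulated trajectory. The following is a minimal sketch, assuming the model output is available as three daily series of equal length; the names `summarize_outbreak`, `infected`, `new_infections`, and `new_deaths` are hypothetical and chosen only for illustration, and ties in the maximum are resolved by taking the earliest day.

```python
from typing import NamedTuple, Sequence


class OutbreakSummary(NamedTuple):
    peak_time: int          # index of the day with the most infected people
    peak_magnitude: float   # number of infected people on that day
    total_infected: float   # cumulative infections over the whole horizon
    total_deaths: float     # cumulative deaths over the whole horizon


def summarize_outbreak(
    infected: Sequence[float],
    new_infections: Sequence[float],
    new_deaths: Sequence[float],
) -> OutbreakSummary:
    """Compute the four performance measures from daily series of equal length."""
    if not infected or not (len(infected) == len(new_infections) == len(new_deaths)):
        raise ValueError("series must be non-empty and of equal length")
    # The first day attaining the maximum is reported as the peak time.
    peak_time = max(range(len(infected)), key=lambda day: infected[day])
    return OutbreakSummary(
        peak_time=peak_time,
        peak_magnitude=infected[peak_time],
        total_infected=sum(new_infections),
        total_deaths=sum(new_deaths),
    )


if __name__ == "__main__":
    # Hypothetical five-day trajectory, for illustration only (not study data).
    print(summarize_outbreak([1, 5, 9, 9, 3], [1, 4, 5, 2, 0], [0, 0, 1, 1, 0]))
```

Keeping the currently infected series separate from the daily new infections avoids confusing prevalence (used for the peak) with incidence (used for the cumulative total).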

We use the same simulation procedure as described above to examine the effectiveness of social media. Table 5 shows the values of all input parameters for the model. The raw number of COVID-19 tweets worldwide is provided by Lamsal (2020). Ninety-four keywords and hashtags are used to collect tweets, such as "corona", "coronavirus", "covid", "covid19", "covid-19", "sarscov2", "sars cov2", "sars cov 2", "quarantine", "flatten the curve", and "#flattenthecurve". The data collection process changed significantly (e.g., more coronavirus-specific keywords were added) on April 18, 2020 and again on May 16, 2020, and each change substantially increased the number of tweets collected thereafter. Thus, we adjusted the raw data to make the daily number of tweets consistent over the period from March 22, 2020 to July 20, 2020 (see Footnote 1). We also normalized these numbers in the same way that HealthTweets.org normalizes the number of influenza tweets (Broniatowski et al. 2013). Figure 9 plots the daily normalized number of COVID-19 tweets worldwide from March 22, 2020 to July 20, 2020. These data serve as M(t), where t is a day.

Table 5 Model parameters

Fig. 9 Number of normalized tweets about COVID-19

According to Chu et al. (2020), social distancing measures, including keeping distance, wearing masks, and wearing eye covers, could decrease the transmission risk by 7.5% to 15.9%. Thus, we set up three scenarios with \(\eta \in \left\{0, 7.5\%, 15.9\%\right\}\). As discussed earlier, the solutions of \(\tau \) are insensitive to the value of \(\eta \). Thus, we combined the two solution sets, \({S}_{16\%}\cup {S}_{21\%}\), and randomly drew \(\tau \) from the combined set in the simulation process. For each value of \(\eta \), we ran 1,000 replications.
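
The replication design can be sketched as follows. This is an illustrative outline only: `run_scenarios` and `simulate_once` are hypothetical names, `simulate_once` stands in for one run of the compartmental model described earlier, and the sketch assumes \(\tau \) is drawn uniformly and independently from the distinct values in the combined set (the text does not specify the sampling scheme).

```python
import random
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")

ETAS = (0.0, 0.075, 0.159)   # the three scenarios for eta
N_REPLICATIONS = 1000        # replications per scenario


def run_scenarios(
    simulate_once: Callable[[float, float, random.Random], T],
    s16: Sequence[float],
    s21: Sequence[float],
    etas: Sequence[float] = ETAS,
    n_replications: int = N_REPLICATIONS,
    seed: int = 0,
) -> Dict[float, List[T]]:
    """Run n_replications per eta, drawing tau from the union of s16 and s21."""
    tau_pool = sorted(set(s16) | set(s21))  # union of the two solution sets
    if not tau_pool:
        raise ValueError("the combined solution set is empty")
    rng = random.Random(seed)
    results: Dict[float, List[T]] = {}
    for eta in etas:
        results[eta] = [
            simulate_once(eta, rng.choice(tau_pool), rng)
            for _ in range(n_replications)
        ]
    return results


if __name__ == "__main__":
    # Placeholder that only echoes its inputs; it is NOT a disease model.
    def echo(eta: float, tau: float, rng: random.Random):
        return (eta, tau)

    # Hypothetical solution sets for tau, for illustration only.
    out = run_scenarios(echo, s16=[0.2, 0.4], s21=[0.4, 0.6], n_replications=5)
    print({eta: len(runs) for eta, runs in out.items()})
```

Passing the model in as a callable keeps the sampling logic separate from the disease dynamics, so the same loop can be reused for other parameter settings.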

The simulation results show that social media has a statistically significant effect on the peak time and the total number of deaths only when \(\eta \) is sufficiently large. As \(\eta \) increases from 0 to 7.5%, the changes in the peak time and the number of deaths are not statistically significant. But as \(\eta \) increases from 0 to 15.9%, the peak time increases by 0.6% and the number of deaths is reduced by 0.25%, both with statistical significance. However, such differences have little practical meaning.

As \(\eta \) increases, the peak magnitude and the total number of infected decrease with statistical significance. Figures 10 and 11 show the means with 95% confidence intervals based on the simulation results. In other words, social media could reduce both the highest number of infected on any single day and the total number of infected. However, social media is less effective for COVID-19 than for influenza in terms of the percentage reduction in the peak magnitude and the total infected.
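
For reference, a mean with a 95% confidence interval can be computed from replication outputs as in the sketch below. It assumes a normal approximation, which is reasonable for 1,000 replications; the text does not state which interval formula was used, so this is an assumption, and the function name `mean_with_ci95` is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple

Z_95 = 1.96  # two-sided 95% critical value of the standard normal


def mean_with_ci95(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using mean +/- 1.96 * s / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replications are needed")
    mean = statistics.fmean(values)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    half_width = Z_95 * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Hypothetical replication outputs, for illustration only.
    print(mean_with_ci95([1, 2, 3, 4, 5]))
```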

Fig. 10 Peak magnitude as \(\eta \) increases

Fig. 11 Total infected as \(\eta \) increases

From the above results of the COVID-19 example, we found that the effectiveness of social media in mitigating an infectious disease depends on how contagious the disease is, where \({R}_{0}\) describes the intensity of an infectious disease outbreak. To examine this phenomenon, we perform further simulation experiments in which \({R}_{0}\) is varied from 1.1 to 3.87 in increments of 0.2. In this range, \({R}_{0}=1.1\) represents an infectious disease similar to a mild seasonal influenza epidemic and \({R}_{0}=3.87\) represents a pandemic such as COVID-19. For each value of \({R}_{0}\), we run the simulation model with 1,000 replications using the parameters specified in Table 5. The differences in the peak time, the peak magnitude, the total infected, and the total deaths between the scenario without social media (\(\eta =0\)) and each scenario with social media (\(\eta =7.5\%\) and \(\eta =15.9\%\)) are calculated for each \({R}_{0}\). Figures 12, 13, 14, and 15 illustrate the average difference with its 95% confidence interval at each \({R}_{0}\).
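
The sweep and the per-value comparison can be sketched as follows. Two assumptions are made and should be read as such: the grid is taken to be 1.1, 1.3, ..., 3.7 followed by the endpoint 3.87 (a step of 0.2 from 1.1 does not land on 3.87 exactly), and the runs with and without social media are treated as independent samples when forming the interval. The names `r0_grid` and `mean_difference_ci95` are hypothetical.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def r0_grid(start: float = 1.1, stop: float = 3.87, step: float = 0.2) -> List[float]:
    """Return start, start + step, ... (all below stop), followed by stop itself."""
    grid: List[float] = []
    k = 0
    # Rounding removes floating-point drift such as 1.3000000000000003.
    while round(start + k * step, 10) < stop:
        grid.append(round(start + k * step, 10))
        k += 1
    grid.append(stop)
    return grid


def mean_difference_ci95(
    without_sm: Sequence[float], with_sm: Sequence[float]
) -> Tuple[float, float, float]:
    """Mean(without) - mean(with) and a normal-approximation 95% interval."""
    diff = statistics.fmean(without_sm) - statistics.fmean(with_sm)
    # Standard error of a difference of two independent sample means.
    se = math.sqrt(
        statistics.variance(without_sm) / len(without_sm)
        + statistics.variance(with_sm) / len(with_sm)
    )
    return diff, diff - 1.96 * se, diff + 1.96 * se


if __name__ == "__main__":
    print(r0_grid())
    # Hypothetical outputs for one R0 value, for illustration only.
    print(mean_difference_ci95([10, 12, 14], [7, 9, 11]))
```

With this sign convention, a positive difference means that the measure is lower when social media is in effect.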

Fig. 12 Differences in peak time as \({R}_{0}\) changes

Fig. 13 Differences in peak magnitude as \({R}_{0}\) changes

Fig. 14 Differences in total infected as \({R}_{0}\) changes

Fig. 15 Differences in total deaths as \({R}_{0}\) changes

Peak time, the day on which the number of infected reaches its highest level, can be prolonged or shortened when social media is in effect. The extension or reduction in peak time is influenced by \({R}_{0}\) and \(\eta \). Figure 12 demonstrates a non-linear pattern between the reduction in peak time and \({R}_{0}\). Social media prolongs peak time if \({R}_{0}=1.3\), but shortens peak time if \({R}_{0}>1.5\). The largest reduction in peak time appears at \({R}_{0}=1.9\), and the reduction gradually decreases as \({R}_{0}\) increases beyond 1.9. Regardless of \({R}_{0}\), higher \(\eta \) leads to larger reductions in peak time.

Peak magnitude, or the highest number of infected in a day, can be reduced when social media is in effect. Figure 13 demonstrates a non-linear pattern between the reduction in peak magnitude and \({R}_{0}\). The largest reduction in peak magnitude appears at \({R}_{0}=1.7\). If \({R}_{0}<1.7\), a lower \({R}_{0}\) results in a smaller reduction in peak magnitude; if \({R}_{0}>1.7\), the reduction gradually decreases as \({R}_{0}\) increases. Regardless of \({R}_{0}\), higher \(\eta \) leads to larger reductions in peak magnitude.

Total infected, or the number of people infected over the entire time duration, can be reduced when social media is in effect. Figure 14 demonstrates a non-linear pattern between the reduction in total infected and \({R}_{0}\). The largest reduction in total infected appears at \({R}_{0}=1.7\). If \({R}_{0}<1.7\), a lower \({R}_{0}\) results in a smaller reduction in total infected; if \({R}_{0}>1.7\), the reduction decreases as \({R}_{0}\) increases. Social media’s effect nearly vanishes when \({R}_{0}=1.1\) and \({R}_{0}=3.87\). Regardless of \({R}_{0}\), higher \(\eta \) leads to greater reductions in the total infected.

Total deaths, or the number of deaths over the entire time duration, can be reduced when social media is in effect. Figure 15 demonstrates a similar non-linear pattern between the reduction in total deaths and \({R}_{0}\). The largest reduction in total deaths appears at \({R}_{0}=1.9\). Social media’s effect nearly vanishes when \({R}_{0}=1.1\) and \({R}_{0}=3.87\). Regardless of \({R}_{0}\), higher \(\eta \) leads to greater reductions in total deaths.

Social media impacts the four metrics differently. Peak time appears to be the least affected by social media. Social media’s effect on total infected and total deaths is more sensitive to \({R}_{0}\) than its effect on peak magnitude. When \({R}_{0}>1.9\), the reductions in total infected and total deaths caused by social media diminish more quickly than the reductions in peak magnitude. Overall, social media is less effective when the infectious disease is mild or very severe, and it is most effective in mitigating the pandemic if the disease’s \({R}_{0}\) is between 1.5 and 1.9.

4 Discussion

Social media is an omnipresent part of today’s world. We suggest that, moving forward, it should be considered part of the larger body of preventative social tools that can be used to spread awareness and mitigate disease, and that it should be used synergistically with other measures and policies to be most effective. Although misinformation, among other factors, is a major drawback of social media use during epidemics and pandemics, it is vital to understand how we can use this tool to benefit public health, especially during an infectious disease outbreak.

Understanding the complex relationship between social media and subsequent changes in behavior in the context of infectious disease can aid the advancement of public policy regarding social media utilization in healthcare communication. With growing knowledge in this area, resources could be allocated more efficiently towards public awareness campaigns for the prevention of and response to pandemics, and further research could determine the specific social media strategies that would be most beneficial in mitigating the spread of future diseases (Mitchell and Ross 2016). With a better understanding of the effect social media has on infectious disease transmission, physicians, scientists, researchers, and other members of the healthcare community may also be encouraged to incorporate social media appropriately into their healthcare communication strategies in the prevention, ongoing, and recovery phases of pandemics, for example by sharing up-to-date, applicable knowledge and protective behaviors. In addition, social media companies may wish to have empirical information to better understand how their platforms do and could aid in other emergency situations.

We hope that an understanding of how best to use social media during pandemics and emergencies, as well as how to use it in conjunction with other interventions, will be applied in practice in future emergency situations. Real-world uses of Twitter have shown promise in disaster response. For example, in 2012, Fairfax County experienced a rapidly moving group of severe and destructive thunderstorms. The Fairfax County Office of Public Affairs had begun incorporating social media and similar outreach technologies in prior years, and was able to use social media to communicate directly with community members and keep them updated with vital information in real time. Twitter was used to spread disaster-related information rapidly, as well as to guide the public to other sources of information. Because Twitter offers retweeting, community-specific hashtags, and similar tools, information regarding a 911 outage and other guiding updates spread through the platform and helped keep community members safe (Space and Naval Warfare Systems Center Atlantic 2013). As more information emerges on how best to utilize Twitter and other social media platforms, such successful use of social media in emergency situations may become more commonplace.

During the present SARS-CoV-2 outbreak, individuals have become more attuned to the notion of social media as a source of public health information. This can currently be seen as the public uses Twitter to share information on awareness, medical breakthroughs, emerging guidelines and rules, and best practices to minimize adverse outcomes from the current pandemic. This study provides empirical evidence of the potential effects Twitter could have on the outcomes of both coronavirus and influenza outbreaks.

Another practical implication of our study is to underscore that social media is an important, model-supported intervention for pandemics and other disasters in which behaviors strongly affect health and safety outcomes. Establishing quantitatively that social media campaigns are a crucial intervention in outbreak response is one of the first steps in implementing modern plans for various communities. Further, understanding how social media may enhance the effectiveness of other interventions can provide a powerful strategy for those who plan emergency response measures. This study adds to the understanding of how the multifaceted relationship among social media, awareness, and transmission can be most accurately integrated into infectious disease transmission modelling. As we are currently experiencing, mathematical modelling is integral to predicting infectious disease spread and to the policies enacted to safeguard our communities. The emergence of a social media component requires a great amount of study and understanding in order to be best incorporated into these models moving forward (Mitchell and Ross 2016).

Finally, much of the world’s population has experienced at least some sort of restriction to their home due to social distancing and quarantine guidelines, which may have resulted in an increased usage of social media to stay connected, keep up-to-date on coronavirus news and scientific information, and share support with others. As social media use becomes more commonplace in public health, it is essential for studies to understand how best to utilize it, backed by quantifiable data.

5 Conclusion

Through our analysis, we determined that social media can help mitigate the spread of contagious disease in terms of peak time, peak magnitude, total infected, and total deaths. In particular, we found that social media’s effect is amplified when a vaccine is available.

Social media’s effect has a non-linear relationship with \({R}_{0}\), and we found that social media is most effective in mitigating the pandemic if \({R}_{0}\) is between 1.5 and 1.9. This implies that if the effective reproduction number of a severe infectious disease, such as COVID-19, can be brought down to this range through other policies, the effectiveness of social media will be amplified and have a synergistic effect. However, \({R}_{0}\) of seasonal influenza is likely to be less than 1.5, and \({R}_{0}\) of COVID-19 is likely to be greater than 1.9. This implies that the ability to control these diseases is limited if we rely on social media alone. Social media’s effect on seasonal influenza would be more apparent if a vaccine is also used, and social media should also be accompanied by other measures and policies in the mitigation of COVID-19.
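
As a rough illustration of what bringing \({R}_{0}\) into this range would require, suppose (as a simplifying assumption that is not part of our model) that other policies scale transmission by a factor \(1-c\), so that the effective reproduction number is \((1-c){R}_{0}\). For \({R}_{0}=3.87\):

\[
1.5\le (1-c)\times 3.87\le 1.9\quad \Longleftrightarrow \quad 1-\frac{1.9}{3.87}\le c\le 1-\frac{1.5}{3.87}\quad \Longleftrightarrow \quad 0.51\lesssim c\lesssim 0.61
\]

Under this assumption, other measures would need to cut transmission by roughly 51% to 61% before social media operates in its most effective range.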

6 Limitations and future scope

There are several limitations in our study that should be considered in its interpretation. The parameter values, such as \({R}_{0}\), used in our compartmental model for both seasonal influenza and COVID-19 are based on CDC data and other published estimates for various disease attributes. The specific range of \({R}_{0}\) in which social media is most effective should be re-evaluated for different diseases, although we believe the non-linear relationship between social media’s effect and \({R}_{0}\) should remain unchanged. Twitter data for both diseases may vary depending on which keywords are used, and data extraction from social media will likely be refined as an increasing number of studies use this source.

An important consideration for future study is the effects of other forms of social media, including Facebook, Instagram, LinkedIn, etc., which were not included in our model. By focusing on Twitter, certain populations may be underrepresented in our sample data. According to a recent national survey conducted by the Pew Research Center (Wojcik and Hughes 2019), Twitter users are younger, more likely to lean Democratic, and more educated, and they have higher incomes than the general public in the U.S. Their views are likely to differ from those of the general public on political and social issues, such as immigration and racial and gender-based inequality. On other issues, the views of Twitter users are similar to those of all U.S. adults. Seasonal influenza seems to be a non-political and non-social issue, while COVID-19 has become a more political issue in the U.S. Fortunately, our analysis is based on the quantity rather than the content of the COVID-19 related tweets. Thus, we believe that the potential bias, if any, due to using Twitter in our study has limited impact on the results of our analysis. Additionally, while Twitter provides important information regarding healthcare communication through social media, it does not wholly replace data retrieved from all platforms. Other social media sites provide different forms of information and posts that are also leveraged for healthcare communication, and their impacts should be studied in the context of how they affect infectious disease transmission in future models.

Finally, we have primarily treated social media in our model as a tool to help inform individuals, which we believe may help influence behavior and subsequently affect disease spread through constantly updated knowledge of the relevant viruses and best practices for prevention. However, social media could also provide benefit during the recovery stage of a pandemic or of other possible future disasters. This factor is also worth further investigation to better understand the comprehensive role of social media in disaster situations.