Introduction

The unprecedented outbreak and pandemic of the coronavirus disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have been declared a global emergency by the World Health Organization (WHO) [1]. This pandemic has challenged many nations, healthcare systems, public policymakers, healthcare professionals, and many other stakeholders involved in the public’s health and safety [2, 3].

Socioeconomic and racial disparities in access to health care services, health status, and health outcomes are well-documented in the literature [4,5,6,7,8,9]. Aside from the individual-level characteristics that impact outcomes, populations residing in neighborhoods with higher concentrations of socioeconomic disadvantages and racial/ethnic minorities are at greater risk of experiencing different forms of health disparities [10,11,12,13,14]. For example, those living in areas of concentrated economic hardship or higher racial/ethnic minority concentration are more likely to experience adverse health outcomes such as reduced prostate cancer survival rates, increased infant mortality, and higher incidence of premature deaths [15,16,17]. In addition to these findings, the infectious nature of COVID-19 highlights the importance of examining spatially linked measures as community members’ behavior and circumstances can impact the spread of infection and survival [18]. For instance, the ability to work remotely or isolate from infected household members reduces infection risk.

To date, individual patient-level and aggregated population-level studies have shown that Black and Hispanic minorities are more frequently hospitalized due to COVID-19 and experience a higher rate of death due to the disease and compounding factors [19,20,21,22,23].

We hypothesized that living in severely disadvantaged or racially segregated neighborhoods may lead to disproportionately higher COVID-19 deaths. The theoretical concept underlying our work is that several indicators of low socioeconomic status acting synergistically have a far greater negative impact than any single indicator alone, such as the percentage of the population below the poverty level [24]. Neighborhood-level poverty is often strongly associated with other spatially linked disadvantages, including the percentage of households receiving public assistance, the percentage of female-headed families, and racial segregation [25]. The spatialized nature of structural and environmental racism and socioeconomic disadvantage, along with the infectious nature of COVID-19, calls for attention beyond the individual characteristics of residents of a neighborhood.

Our study is informed by the social determinants of health, ecological systems theory, and social disorganization theory [26,27,28]. Our objective was to examine the association between COVID-19 deaths at the county level and the percentage of the county population residing in socioeconomically and racially segregated neighborhoods. We used concentrated disadvantage and Black concentration at the tract level to measure socioeconomic segregation and racial segregation, respectively.

Methods

Data Sources

We included 73,056 tracts and 3142 counties in 50 states and the District of Columbia in this study. We accessed the most recent county population estimates from the United States Census Bureau website for 2019 [29].

We obtained the county-level data for confirmed COVID-19 deaths from January 22 to July 21, 2020, for all 50 states and the District of Columbia from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [30]. This publicly available data was initially gathered from the Centers for Disease Control and Prevention (CDC), local, state, and territory health departments, and other sources [31].

To ensure the availability of data for smaller counties with a population of less than 20,000, we used American Community Survey (ACS) 5-year estimates (2014–2018), which are based on data collected over 60 months from January 1, 2014, to December 31, 2018 [32]. Data from ACS 1-year and 3-year estimates were limited to areas with populations of more than 20,000. We used both county and census tract level estimations in this study.

We used the county-level 2018–2019 release of Area Health Resources File (AHRF) for several variables. AHRF provides comprehensive county-level information on population characteristics, demographics, environment, and healthcare facilities and professionals [33]. We obtained data on the intensive care unit (ICU) beds in each county from the Kaiser Family Foundation [34].

To account for differences in the county population, we adjusted all variables for county population size and included the transformed variables in the analyses in proportion or percentage form. We merged all datasets using the county Federal Information Processing Standards (FIPS) code as each county’s unique identifier. The data sources and variables used in this study are presented in Appendix Table 1.
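
As a rough illustration of the merge step, the sketch below joins two county tables on a five-digit FIPS string. The record layout, field names, and sample values are hypothetical and chosen only for illustration; the analyses in this study were run in Stata, and the actual merge code is not described here. Zero-padding matters because FIPS codes for states such as Alabama begin with "0" and lose that digit when read as integers.

```python
def normalize_fips(code):
    """Return a county FIPS code as a zero-padded 5-character string."""
    return str(code).strip().zfill(5)


def merge_on_fips(left, right):
    """Inner-join two lists of dict records on their 'fips' field."""
    right_by_fips = {normalize_fips(r["fips"]): r for r in right}
    merged = []
    for row in left:
        key = normalize_fips(row["fips"])
        if key in right_by_fips:
            # Fields from both records, with the normalized key stored last.
            merged.append({**right_by_fips[key], **row, "fips": key})
    return merged


# Hypothetical records; the numbers are made up for illustration.
deaths = [{"fips": 1001, "deaths": 21}, {"fips": "36061", "deaths": 100}]
population = [{"fips": "01001", "population": 55869}]

merged = merge_on_fips(deaths, population)
print(merged)
```

Counties that appear in only one table are dropped by this inner join, so in practice it is worth comparing the merged row count against the expected number of counties.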

Measures

COVID-19 Deaths

Confirmed COVID-19 deaths per 100,000 population was the outcome variable of interest. We constructed this measure by dividing the total number of confirmed COVID-19 deaths in each county as of July 21, 2020, by the 2019 county population and multiplying by 100,000. We then rounded the results to the nearest whole number to obtain values suitable for the statistical analysis of count data.
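
The outcome construction can be sketched as follows. The function name and the example numbers are hypothetical, and the sketch uses Python's built-in rounding, which may treat exact halves differently from the software used in the study.

```python
def deaths_per_100k(deaths, population):
    """Confirmed deaths per 100,000 residents, rounded to a whole number."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Note: round() sends exact .5 values to the nearest even integer.
    return round(deaths / population * 100_000)


# Hypothetical county: 12 confirmed deaths among 58,000 residents.
example_rate = deaths_per_100k(12, 58_000)
print(example_rate)  # 21
```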

Socioeconomic and Racial Segregation

The percentage of the Black population or socioeconomically disadvantaged population at the county level may not be proper measures of racial and socioeconomic segregation [35]. Counties with the same percentage of the Black or low-SES (socioeconomic status) population may have different population distribution patterns ranging from highly segregated neighborhoods to proportionately distributed racial and socioeconomic groups at the neighborhood level. To mitigate this issue, we constructed two tract-level variables.

We first checked the correlation between potential variables commonly used in previous studies to construct the concentrated disadvantage variable [36]. Then, we performed principal component analysis and identified five variables that loaded onto a single factor accounting for 67.1% of the observed variation with high reliability (Cronbach’s α = 0.84): (1) percentage of population below poverty level, (2) percentage of households receiving public assistance, (3) percentage of female-headed households, (4) unemployment rate, and (5) percentage of people 25 years or older with less than high school education. To create a single index for each tract, we converted each of the five values to Z scores and calculated the average Z score for each tract. Then, we ranked the final scores from lowest to highest. We classified tracts in the top quartile (at or above the 75th percentile) as those with concentrated disadvantage. We then divided the population of these tracts by the total county population to generate the percentage of the county population residing in concentrated disadvantage. Similar approaches have been used in several previous studies [7, 37].
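
The index construction described above (Z scores, averaging, top-quartile classification, and aggregation to the county) can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's code: the field names and sample tracts are hypothetical, the principal component analysis step is omitted, the percentile cutoff rule is one of several reasonable conventions, and the county total is taken as the sum of its tract populations.

```python
from math import ceil
from statistics import mean, pstdev

# Hypothetical field names for the five indicators listed in the text.
INDICATORS = ["poverty", "public_assistance", "female_headed",
              "unemployment", "less_than_high_school"]


def z_scores(values):
    """Standardize a list of numbers (assumes they are not all equal)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def disadvantage_scores(tracts):
    """Average the five indicator Z scores for each tract."""
    columns = [z_scores([t[name] for t in tracts]) for name in INDICATORS]
    return [mean(col[i] for col in columns) for i in range(len(tracts))]


def top_quartile_flags(scores):
    """Flag tracts whose score is at or above the 75th-percentile cutoff."""
    ranked = sorted(scores)
    cutoff = ranked[min(ceil(0.75 * len(ranked)), len(ranked) - 1)]
    return [s >= cutoff for s in scores]


def percent_in_flagged_tracts(tracts, flags):
    """Percentage of each county's tract population living in flagged tracts."""
    total, flagged = {}, {}
    for tract, is_flagged in zip(tracts, flags):
        county = tract["county"]
        total[county] = total.get(county, 0) + tract["population"]
        if is_flagged:
            flagged[county] = flagged.get(county, 0) + tract["population"]
    return {c: 100 * flagged.get(c, 0) / total[c] for c in total}


# Four made-up tracts in two made-up counties.
tracts = [
    {"county": "01001", "population": 1000, "poverty": 5, "public_assistance": 2,
     "female_headed": 10, "unemployment": 3, "less_than_high_school": 8},
    {"county": "01001", "population": 3000, "poverty": 10, "public_assistance": 4,
     "female_headed": 12, "unemployment": 5, "less_than_high_school": 10},
    {"county": "01003", "population": 2000, "poverty": 15, "public_assistance": 6,
     "female_headed": 15, "unemployment": 6, "less_than_high_school": 15},
    {"county": "01003", "population": 2000, "poverty": 40, "public_assistance": 20,
     "female_headed": 35, "unemployment": 15, "less_than_high_school": 30},
]

flags = top_quartile_flags(disadvantage_scores(tracts))
shares = percent_in_flagged_tracts(tracts, flags)
print(shares)  # {'01001': 0.0, '01003': 50.0}
```

In this toy example only the last tract falls in the top quartile, so half of the second county's population is counted as residing in concentrated disadvantage.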

To isolate potential disparities attributable to racial segregation, we used the percentage of Black residents at the tract level as a proxy to identify Black-concentrated neighborhoods [38,39,40]. We classified tracts as Black-concentrated neighborhoods if 25% or more of the residents were Black. We then calculated the percentage of the county population living in these tracts to construct the percentage of people living in Black-concentrated neighborhoods. The detailed descriptive statistics and correlation matrix for the tract-level variables used in this study are included in Appendix Table 2.
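
A minimal sketch of this classification is shown below, assuming hypothetical field names and made-up tract counts. The `threshold` parameter defaults to the 25% cutoff used in the main model; as an assumption of this sketch, tracts with zero population are skipped and the county total is the sum of its tract populations.

```python
def percent_in_black_concentrated_tracts(tracts, threshold=25.0):
    """Percentage of each county's population living in tracts where the
    Black share of residents is at or above `threshold` percent."""
    total, concentrated = {}, {}
    for tract in tracts:
        population = tract["population"]
        if population <= 0:
            continue  # skip unpopulated tracts (assumption of this sketch)
        county = tract["county"]
        total[county] = total.get(county, 0) + population
        black_share = 100 * tract["black_population"] / population
        if black_share >= threshold:
            concentrated[county] = concentrated.get(county, 0) + population
    return {c: 100 * concentrated.get(c, 0) / total[c] for c in total}


# Made-up county with one tract at 30% Black and one at 10% Black.
example_tracts = [
    {"county": "X", "population": 1000, "black_population": 300},
    {"county": "X", "population": 3000, "black_population": 300},
]

main_cutoff = percent_in_black_concentrated_tracts(example_tracts)
higher_cutoff = percent_in_black_concentrated_tracts(example_tracts, threshold=50.0)
print(main_cutoff)    # {'X': 25.0}
print(higher_cutoff)  # {'X': 0.0}
```

Note that the county as a whole is 15% Black in this example, yet a quarter of its residents live in a Black-concentrated tract, which is the distinction between composition and segregation that the tract-level measure is meant to capture.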

Covariates

We included the percentages of the county population aged 45–64 years and 65 years or older to control for high-risk groups with higher rates of COVID-19 mortality [41]. We included population density (persons per square mile) and core-based statistical area (CBSA) status of the county to account for the potential impact of population density and the degree of urbanization on the speed of the spread of COVID-19 and other airborne respiratory infectious diseases [42].

Furthermore, we included population-size-adjusted ICU and hospital beds and percent of the uninsured population to adjust for possible variation in COVID-19 deaths attributable to differences in availability and accessibility of healthcare resources and services.

We included a measure of geographical census regions (Northeast, Midwest, South, and West) to adjust for the potential impacts of climate and environmental-based factors such as temperature, humidity, and precipitation on the spread of COVID-19 or severity of the infection [43,44,45,46].

To control for the differences in duration of county-level exposure to the pathogen, we included a variable for the number of days passed since the first confirmed COVID-19 case in each county.

Statistical Analysis

The discrete (count) data for our dependent variable (COVID-19 deaths per 100,000 population) were highly right-skewed and overdispersed. A disproportionately large share of the values was concentrated around zero (counties with no or few confirmed COVID-19 deaths), and the variance was much larger than the mean. Because of these conditions, and to account for potential state-level variations in COVID-19-related containment policies regarding social distancing, testing, and reporting, we used mixed-effects negative binomial regression models with random effects at the state level to examine the association between county-level COVID-19 mortality and other characteristics. We took this measure to avoid errors due to state-level variation in factors such as the issuance, length, and restrictiveness of social distancing measures and state-specific reporting criteria and testing availability, which could affect the spread of COVID-19 and subsequent deaths.
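
The overdispersion described above can be checked descriptively by comparing the variance of the counts with their mean; a Poisson model assumes the two are equal, so a ratio far above one points toward a negative binomial specification. The sketch below is only a descriptive check on made-up values, not a formal test and not the study's Stata code; the function name and data are hypothetical.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Summarize the mean, sample variance, their ratio, and the zero share."""
    m = mean(counts)
    v = variance(counts)  # sample variance; needs at least two values
    return {
        "mean": m,
        "variance": v,
        "variance_to_mean": v / m,
        "share_zero": sum(1 for c in counts if c == 0) / len(counts),
    }


# Made-up county death rates: many small values and one very large one.
example_counts = [0, 0, 0, 1, 2, 3, 5, 8, 40, 141]
summary = dispersion_summary(example_counts)
print(summary)
```

For these made-up values the mean is 20 while the variance is in the thousands, the kind of pattern that motivates a negative binomial rather than a Poisson model.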

We log-transformed several independent variables before including them in the analysis. We ranked the counties by population and checked that no disproportionate numbers of deaths were reported for counties with populations of fewer than 1000.

We performed several sensitivity analyses with different inclusion criteria to check for the presence and persistence of the observed relations. We ran a model including all-cause 2019 deaths at the county level to adjust for deaths primarily attributable to causes other than COVID-19. Regardless of pre-existing health conditions or comorbidities, all confirmed or suspected deaths due to COVID-19 are included in COVID-19 death reports [47, 48]. Clinical studies indicate that it takes approximately 2 to 8 weeks from the onset of symptoms to death in COVID-19 patients [49]. Because of this and the ongoing nature of the COVID-19 pandemic, we ran a model limited to counties that reached a threshold of 10 cases per 100,000 population 2 weeks before the analysis date (July 21, 2020) to adjust for the course of illness. Because racially segregated and concentrated disadvantage neighborhoods are more likely to lack the health-related resources and access to healthcare services necessary to treat severe COVID-19 cases [50], we performed another analysis using the main model but without controlling for two variables that are characteristic of tract-level racial and economic segregation: ICU beds and uninsurance rate.

We ran an additional 18 models with different cutoffs (thresholds) for classifying racially and socioeconomically segregated tracts. Given that the majority of racially segregated Black neighborhoods are also economically disadvantaged, we performed another analysis including an interaction term between concentrated disadvantage and Black concentration. We performed all analyses in Stata MP version 16.1 [51].

Results

Table 1 contains the county-level population characteristics used in this study. We transformed all variables to proportions (or percentages) by dividing the county-level counts by the total county population to adjust for county population size. The average number of confirmed COVID-19 deaths among the 3142 counties was 20.4 per 100,000 population. As expected, the average was slightly higher, at 21.1 per 100,000 population, among counties with more than 10 cases per 100,000 population 2 weeks before data analysis (counties in the epidemic phase). About 13.7% of the US population resided in concentrated disadvantage tracts, and 13.3% resided in Black concentration tracts, defined as tracts in which 25% or more of residents were Black. These percentages were slightly higher among counties that had entered the epidemic phase. This result might suggest that counties with a higher percentage of the population residing in concentrated disadvantage and Black-concentrated tracts entered the epidemic phase earlier than other counties.

Table 1 Characteristics of the US counties included in the analysis until July 21, 2020. Mean (standard deviation), minimum-maximum, and frequency (percentage)

Table 2 contains results from the multivariate mixed-effects negative binomial regression models. Mortality rate ratios (MRR) greater than one indicate a positive association between covariates and confirmed COVID-19 deaths at the county level. Conversely, an MRR less than one indicates a negative or inverse association. For every 10% increase in the percentage of the county population residing in concentrated disadvantage, the confirmed COVID-19 deaths per 100,000 county population increase by a factor of 1.14 (95% CI: 1.11, 1.18). Similarly, for every 10% increase in the percentage of the county population residing in Black concentration neighborhoods, the confirmed COVID-19 deaths per 100,000 county population increase by a factor of 1.11 (95% CI: 1.08, 1.14). In other words, for every 10% increase in the proportion of the county population residing in concentrated disadvantage, the COVID-19 death rate increases by about 14%. Likewise, a 10% increase in the proportion of the county population residing in segregated Black neighborhoods is associated with an increase in COVID-19 deaths of about 11%.
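
The conversion from a rate ratio to a percentage change is simple arithmetic, sketched below with the point estimates reported above. The function names are hypothetical. The compounding example assumes a log-linear model in which the predictor enters untransformed, so that each additional 10% step multiplies the expected rate by the same ratio; that assumption is ours for illustration and may not match the exact model specification.

```python
def percent_change(mrr):
    """Percentage change in the expected death rate implied by a rate ratio."""
    return (mrr - 1.0) * 100.0


def compounded_ratio(mrr, steps):
    """Rate ratio after `steps` successive increments (log-linear assumption)."""
    return mrr ** steps


# Point estimates reported in the text.
disadvantage_mrr = 1.14
black_concentration_mrr = 1.11

print(round(percent_change(disadvantage_mrr), 1))         # 14.0
print(round(percent_change(black_concentration_mrr), 1))  # 11.0
# Two successive 10% steps under the stated assumption: 1.14 * 1.14.
print(round(compounded_ratio(disadvantage_mrr, 2), 4))     # 1.2996
```

The same conversion applies to the confidence limits; for example, the interval of 1.11 to 1.18 corresponds to an increase of roughly 11% to 18%.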

Table 2 Mortality rate ratios (MRR), 95% confidence intervals (CI), and P values in the main model including all counties with at least one reported case as of July 21, 2020, United States (N = 3091)

Other county-level characteristics such as population density and percentage of uninsured county population were also positively associated with confirmed COVID-19 deaths (MRR > 1 and p < 0.05). In terms of the availability of healthcare facilities, the population-adjusted hospital beds were associated with higher deaths. However, ICU beds per 100,000 county population were negatively associated with COVID-19 deaths. That is, counties with more ICU beds per capita had disproportionately lower confirmed COVID-19 deaths per 100,000 population. Western and southern US counties had significantly more confirmed COVID-19 deaths per 100,000 population than northeastern counties. The differences between the rural, micropolitan, and metropolitan counties were not statistically significant (p > 0.1).

The MRRs resulting from each model in the sensitivity analysis are presented in Fig. 1 and Appendix Table 3. These models include the bivariate unadjusted models; a model with only the two segregation variables; a model excluding New York County (an outlier); a model with all-cause 2019 deaths as a predictor; a model including only counties that reached an epidemic threshold of more than 10 cases per 100,000 population 2 weeks before July 21, 2020; and a model excluding the percentage of uninsured and ICU beds. The observed significant relations between segregation and COVID-19 deaths persisted in all models.

Fig. 1 Mortality rate ratios (MRR) and 95% confidence intervals from the select models included in the sensitivity analysis. An MRR can be interpreted as the multiplicative change in confirmed COVID-19 deaths associated with a 10% increase in the percentage of the county population residing in tracts with concentrated disadvantage or Black concentration; for example, an MRR of 1.14 corresponds to an increase of about 14%

Figure 2 plots the interaction between the percentages of the county population residing in socioeconomically and racially segregated tracts. This model was constructed using the main model and an interaction term between the percentage of the county population residing in concentrated disadvantage and the percentage of the county population residing in Black concentration. The marginal predicted mean is highest in the top-right corner of the graph and lowest in the bottom-left corner. This indicates that the joint contribution of living in concentrated disadvantage and concentrated Black tracts to COVID-19 deaths is greater than the individual impact of each of the two segregation variables.

Fig. 2 Interaction between the percentage of county population residing in concentrated disadvantage tracts and the percentage of county population residing in concentrated Black tracts. The joint increase in both percentages is associated with a greater increase in COVID-19 deaths than an increase in either measure of segregation alone

For further sensitivity analysis, we used three different thresholds for classifying tracts as concentrated disadvantage based on the overall concentrated disadvantage score. After ranking all tracts, we classified the top 50%, top 25%, and top 10% of tracts as those with concentrated disadvantage. For racial segregation, we used three thresholds (25%, 50%, and 75% of tract residents being Black or White) to construct the racial segregation variables. In all models, a higher percentage of the county population residing in concentrated disadvantage or Black concentration was associated with higher COVID-19 deaths per 100,000 county population. However, the same cutoffs applied to White concentration showed the opposite direction of association: a higher percentage of the county population residing in White concentration was associated with lower COVID-19 deaths per 100,000 population. Results are presented in Appendix Table 4.

Discussion

Our goal was to examine the association between the percentage of the county population residing in socioeconomically or racially segregated neighborhoods and COVID-19 deaths per 100,000 county population. We used tract-level data to construct the variables for measuring segregation. Evidence of an association between both racial and socioeconomic segregation and COVID-19 deaths was apparent in this study. Even after adjusting the model for multiple covariates and accounting for factors that may impact the outcome, this county-level analysis shows that counties with a higher proportion of the population residing in concentrated disadvantage or Black concentration experience disproportionately higher mortality rates due to COVID-19. These findings are in line with previous studies regarding the presence of health disparities in segregated neighborhoods [52,53,54].

The association between socioeconomic characteristics, such as median income and unemployment rate, and COVID-19 deaths was nonsignificant or contradictory in several previous studies [23, 55, 56]. We hypothesized that the concomitant effects of several low-socioeconomic characteristics among a higher proportion of the population residing in a neighborhood could have a synergistic negative effect on COVID-19 outcomes compared to the presence of a single sporadic low-SES indicator among a lower proportion of the population in a neighborhood.

The contribution of racial and socioeconomic segregation at the community (tract) level to observed disparities in COVID-19 deaths at the aggregated county level can potentially be explained using two groups of factors: (1) factors that facilitate the transmission of the coronavirus from one person to another, and (2) factors that decrease the chances of survival among the infected. In other words, several characteristics of highly segregated Black or low-SES neighborhoods can increase the risk of becoming infected with the coronavirus and of dying as a result of COVID-19 [57, 58]. Therefore, the potential contributing factors can be grouped into aggregated individual or household characteristics and those environmental and spatial characteristics resulting solely from segregation. Given the airborne transmission and highly contagious nature of the coronavirus, individual risk factors can affect residents more severely than in the case of non-contagious diseases or health conditions.

At the individual level, several factors can explain these observed disparities. For example, Black and low-SES individuals in the USA are more likely to be employed as essential workers in occupations such as food distribution, trucking, and janitorial work. Most of these jobs cannot be performed remotely and usually do not offer adequate sick leave [59, 60]. Additionally, low-SES and Black individuals are disproportionately affected by homelessness or reside in housing units with limited space, which makes isolating infected family members challenging or impossible [61, 62]. Moreover, limited or no child/elderly care and higher uninsurance rates impose an additional financial burden on low-SES families, making it challenging to stop working.

As seen in the Tuskegee Study, historical extreme medical racism and the more subtle medical racism that has persisted to the present have resulted in distrust of the medical community (especially non-Black caregivers) in some Black communities [63, 64]. This can potentially affect individuals’ knowledge, attitudes, and behaviors regarding the transmission of COVID-19, proper social distancing and containment measures, testing, and therapeutic interventions, resulting in an increased risk of exposure and a decreased chance of survival.

In addition to the characteristics mentioned earlier, which, collectively, may impose a higher burden of COVID-19 on racially and socioeconomically segregated populations, several factors can be directly related to place, going beyond the individual or household. For example, it has been shown that higher levels of environmental air pollution are associated with higher COVID-19 deaths [65]. Environmental racism and injustice, in which racial minority or low-SES groups are disproportionately exposed to environmental risk factors and pollutants such as polluting factories and toxic waste dumps, are more likely to be present in racially and socioeconomically segregated neighborhoods [66,67,68]. The negative impacts of structural racism on health inequities and the intersections with structural COVID-19 risk factors have been shown in the literature [69,70,71].

High-quality medical care is another factor that can increase the chances of survival among COVID-19 patients. Even when affordable, healthcare services provided in segregated Black and low-SES neighborhoods are usually of lower quality than services provided in healthcare institutions and facilities located in predominantly White and high-SES neighborhoods [72,73,74].

It is imperative to design and implement policies and interventions to mitigate the spread of coronavirus or lessen the health complications and mortality resulting from COVID-19 in segregated communities that suffer from a disproportionate number of deaths. In doing so, both individual and neighborhood factors contributing to COVID-19 morbidity and mortality should be addressed simultaneously.

The use of county-level or other geographical-level data in epidemiological and public health studies to understand geospatial differences in health outcomes is well documented [75, 76]. Using tract-level population characteristics to construct and measure concentrated disadvantage and Black concentration and examine their association with county-level COVID-19 deaths improves our understanding of the potential divisive role of segregation on COVID-19 survival disparities. The use of tract-level data in this study partly mitigated the problems with aggregated population-level analysis at the broader county level [77, 78]. However, to overcome some of our study’s limitations, future multilevel analyses with data on individual characteristics such as COVID-19 outcomes and neighborhood factors can produce more robust conclusions after isolating the contributions of individual and neighborhood-level variables to COVID-19 deaths. Studies on the spread of COVID-19 should also consider potential spatial autocorrelation. Using death rates standardized for age and pre-existing conditions can increase study rigor once comprehensive data on COVID-19 death rates by age and comorbidities become available.

Conclusions

COVID-19 deaths at the county level appear to be positively associated with the proportion of county population residing in concentrated disadvantage and concentrated Black neighborhoods (tracts). This study’s findings suggest that socioeconomically and racially segregated neighborhoods are more vulnerable and are more likely to be disproportionately impacted by the adverse effects of COVID-19. Our findings underscore the need to consider neighborhood-level characteristics in designing and implementing policies targeted at COVID-19 containment.