Special data notes
If a region has an international airport, it was coded as a “2.” If a region has an international airport with direct daily flights to Seattle, New York (JFK or LGA), San Francisco (SFO or OAK), and Washington DC (BWI or DCA), it was coded as a “3.” This coding considers airports within 45 min of the MSA as per Amazon’s RFP when determining access to direct daily flights.
New establishments 100+ and new establishments 500+
This variable shows the total number of new establishments that emerged in a given MSA from 2010 to 2014 from new firms with more than 100 employees. A derivative of this variable counts the new establishments that emerged in a given MSA over the same period from new firms with more than 500 employees.
Volunteer hours per capita, volunteer rate, and the share of population active in their neighborhood
These three metrics come from 2015 Volunteering and Civic Life in America, a dataset produced by the Corporation for National and Community Service (CNCS). CNCS is an independent federal agency that is dedicated to supporting the American culture of citizenship, service, and responsibility. The data were collected through two supplements to the U.S. Census Bureau’s Current Population Survey (CPS)—the Volunteer Supplement (2015) and the Civic Supplement (2013). The data are reported by MSA before metro definitions were revised in 2015.
Volunteer rate is the percentage of respondents to the Current Population Survey’s Volunteer Supplement who reported having performed unpaid volunteer activities for or through an organization at any point during the 12 months preceding the survey.
Walk score, transit score, and bike score
The scores are calculated by Walk Score for the 141 largest core cities in the U.S. and Canada. Walk Score is a private tech company that originated in Seattle and is now owned by Redfin, a real estate agency.
Walk Score is designed to assess the walkability of an area. The score analyzes walking routes to nearby amenities. Points are awarded based on the distance to amenities in each category. Amenities within a 5-min walk (0.25 miles) are given maximum points. A decay function is used to give points to more distant amenities, with no points given beyond a 30-min walk. The score also measures pedestrian friendliness by analyzing population density and road metrics such as block length and intersection density.
Transit Score aims to reflect how well an area is served by public transport. Points are assigned to nearby transit routes based on the frequency, type of route, and distance to the nearest stop on the route.
Bike Score aims to reflect how convenient an area is for biking. Points are awarded based on availability of bike infrastructure (e.g., lanes, trails), hills, road connectivity, and the number of bike commuters. For each score, the points are summed and normalized to a score between 0 and 100. For details, see https://www.walkscore.com/.
Average hours of sunshine per year
The data for this metric come from the World Meteorological Organization Standard Normals dataset, accessed through the United Nations Data portal. It measures the mean number of hours of sunshine per year for cities all over the world, including in the U.S. For most of the cities used in our analysis, the reported values are averages computed over the 30-year period from 1961 to 1990. For details, see http://data.un.org.
Number of good air quality days per year
This metric is designed by the U.S. Environmental Protection Agency and measures how clean or polluted the air is in each MSA, and whether the associated health effects might be a concern. The data are based on the Air Quality Index (AQI), which focuses on measuring ground-level ozone and particle pollution. Days are evaluated as good, moderate, unhealthy for sensitive groups, unhealthy, very unhealthy, or hazardous. We use the number of days evaluated as ‘good.’ During ‘good’ days, air quality is considered satisfactory and poses little or no risk. For details, see https://www.epa.gov/outdoor-air-quality-data/air-quality-index-report.
Economic inclusion index and racial inclusion index
These indices are calculated by the Urban Institute for the 274 largest cities across the U.S. The Urban Institute is a non-profit research organization based in Washington, DC.
The Economic Inclusion Index measures the ability of residents with lower incomes to contribute to and benefit from economic prosperity. The indicators used to calculate the index are income segregation rank, the share of renters who pay 35% or more of their income in rent, the share of 16- to 19-year-olds who are not in school and have not graduated, and the share of families below the poverty line whose householder works full-time. Income segregation is computed by estimating the segregation between families above and below each income distribution bucket at the census tract level. The indicator values are then averaged (weighted by income relative to the median income) to construct the city-level measure. The index is computed as an average of z-scores of these four indicators.
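The "average of z-scores" step can be illustrated with a minimal Python sketch. This is not the Urban Institute's code: the function names, the use of the population standard deviation, and the sample values are all assumptions made for illustration, and the sketch does not address whether any indicator is sign-flipped before averaging.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All cities share the same value, so the indicator carries no signal.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_index(indicators):
    """Average the z-scores of several indicators, city by city.

    indicators: dict mapping an indicator name to a list of values,
    one per city, with every list in the same city order.
    """
    standardized = [z_scores(vals) for vals in indicators.values()]
    return [mean(city_scores) for city_scores in zip(*standardized)]


# Hypothetical values for three cities (not real data).
example = {
    "rent_burden_share": [0.40, 0.45, 0.50],
    "working_poor_share": [0.02, 0.03, 0.04],
}
index = composite_index(example)
```

Because each indicator is standardized before averaging, indicators measured on different scales contribute equally to the composite value.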
The Racial Inclusion Index measures the ability of residents of color to contribute to and benefit from economic prosperity. The indicators used to calculate the index are racial segregation, homeownership gap, educational attainment gap, poverty rate gap, and share of people of color. Racial segregation is calculated as (1/2) * ((# people of color in census tract/# people of color in city) - (# non-Hispanic white in census tract/# non-Hispanic white in city)). The homeownership gap is calculated as the difference between the share of white non-Hispanic households that own a home and the share of households of persons of color that own a home. The educational attainment gap is calculated as the difference between the share of the white non-Hispanic population over 25 with a high school degree or higher and the share of the person of color population over 25 with a high school degree or higher. The poverty gap is calculated as the difference between the poverty rate for the white non-Hispanic population and the poverty rate for the person of color population. The index is computed as an average of z-scores of these five indicators. For details, see https://apps.urban.org/features/inclusion/.
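The segregation expression above resembles the standard index of dissimilarity. The Python sketch below makes that reading explicit under a stated assumption: the tract-level terms are taken in absolute value and summed over all census tracts in the city, which the text does not spell out. The function names and sample counts are hypothetical.

```python
def dissimilarity_index(tracts):
    """Index of dissimilarity between two groups across census tracts.

    tracts: list of (people_of_color, non_hispanic_white) counts per tract.
    Assumption: absolute differences are summed over tracts, as in the
    standard dissimilarity index.
    """
    total_poc = sum(poc for poc, _ in tracts)
    total_white = sum(white for _, white in tracts)
    return 0.5 * sum(
        abs(poc / total_poc - white / total_white) for poc, white in tracts
    )


def gap(white_rate, poc_rate):
    """Gap as described in the text: white non-Hispanic rate minus POC rate."""
    return white_rate - poc_rate


# Hypothetical two-tract cities (not real data).
fully_segregated = dissimilarity_index([(100, 0), (0, 100)])
evenly_mixed = dissimilarity_index([(50, 50), (50, 50)])
```

Under this reading the measure ranges from 0 (both groups distributed identically across tracts) to 1 (no tract shared by the two groups).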
Share of the foreign-born population
This metric is based on the 2012–2016 American Community Survey 5-Year Estimates, Table S0501, accessed through the American FactFinder. The share of foreign-born population is calculated by dividing the estimated number of foreign-born persons by the total population for each area in the analysis. For details, see https://factfinder.census.gov/.
Share of the population who speak only English at home
This metric is based on the 2012–2016 American Community Survey 5-Year Estimates, Table S0601, accessed through the American FactFinder. The share of the population that speaks only English at home is calculated by dividing the number of persons over 5 years old who speak only English at home by the total population over 5 years old. For details, see https://factfinder.census.gov/.
The share of the population who speak only English at home may be a more effective measure of the ethnic and cultural diversity of the population than the share of foreign-born residents alone. The share of the foreign-born population tells us only about first-generation migrants, but the share of the population who speak languages other than, or in addition to, English at home also captures the children of earlier generations of migrants, who are likely to have preserved their cultural identity. The share of the population who speak only English at home is thus a proxy for the level of cultural homogeneity in a given area.
Rate of population growth 2010–2017
This metric is based on data from the Population Estimates Program of the U.S. Census Bureau, accessed through American FactFinder. To find the rate of population growth, we divided the 2017 population estimate by the 2010 population estimate and subtracted one from the result. For details, see https://factfinder.census.gov/.
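The growth-rate calculation described above reduces to a single expression. The Python sketch below shows it with hypothetical population figures; the function name is ours.

```python
def growth_rate(pop_start, pop_end):
    """Cumulative growth over the period: end population / start population - 1."""
    return pop_end / pop_start - 1


# Hypothetical estimates (not real data): 1,000,000 in 2010, 1,100,000 in 2017.
rate = growth_rate(1_000_000, 1_100_000)
```

The result is the cumulative growth over the whole 2010–2017 period, not an annualized rate.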
Violent and property crime rates
For this metric, we use 2014 crime rates as reported by Uniform Crime Reporting Statistics (UCR), U.S. Department of Justice. Crime rate is defined as the number of crimes per 100,000 residents. The crime rates are reported by UCR at the core city level and come from the respective city agencies. Violent crime includes murder, rape, robbery, and aggravated assault. Property crime includes burglary, larceny-theft, and motor vehicle theft. For details, see https://www.bjs.gov/ucrdata/Search/Crime/Crime.cfm.
Median housing cost per month
This metric is based on the 2012–2016 American Community Survey 5-Year Estimates, Table B25105, accessed through American FactFinder.
Share of the population who reported mental distress and share of the population who reported bad physical health in the last 30 days
These metrics report the age-adjusted 2015 estimates from the Local Data for Better Health dataset produced by the Centers for Disease Control and Prevention, a U.S. federal agency under the Department of Health and Human Services. The dataset contains information for the 500 largest core cities and was released in 2017.
The share of population who reported mental distress 14 or more days in the last 30 days was computed by dividing the number of respondents age 18 years or older who report 14 or more days during the past 30 days during which their mental health was not good, by the total number of respondents. The share of population who reported bad physical health 14 or more days in the last 30 days was computed by dividing the number of respondents aged 18 years or older who report 14 or more days during the past 30 days during which their physical health was not good by the total number of respondents. For details, see https://chronicdata.cdc.gov/.
Park score
Park Score is calculated by the Trust for Public Land, a U.S.-based non-governmental organization dedicated to creating and improving neighborhood parks. The score assesses the quality and accessibility of parks in the 100 most populous core cities in the U.S.
The indicators behind the score are grouped into four areas: acreage, investment, amenities, and access. For acreage, the indicators include median park size and parkland as a share of city area. For investment, the indicators include public spending, non-profit spending, and monetized volunteer hours worked at public parks and recreation agencies. For amenities, the indicators include the number of park amenities per capita, with amenities defined as playgrounds, rest rooms, dog parks, splash pads, recreation and senior centers, and basketball hoops. For access, the indicator is the share of the population living within a 10-min walk of a park. Cities can earn a maximum score of 100. For details, see http://parkscore.tpl.org.
Female mayor
The data for this metric come from the Center for American Women and Politics (CAWP) at Rutgers Eagleton Institute of Politics. A value of “1” indicates the central city of the MSA in question had a female mayor as of March 2018, and “0” indicates that the central city had a male mayor. For details, see http://www.cawp.rutgers.edu/levels_of_office/women-mayors-us-cities-2018.
Share of female members in the city council
The data for this metric were collected from individual city council websites for central cities of the MSAs. We divided the number of female members by the total number of members for that council. Mayors were excluded from the calculations.
Share of employees in arts, entertainment, and culture
This metric is based on the 2012–2016 American Community Survey 5-Year Estimates, accessed through American FactFinder. The share of the employees who work in arts, entertainment, and culture industries was calculated by dividing the number of persons who reported employment in these industries by the total population who reported employment. For details, see https://factfinder.census.gov/.
Government spending, taxation, and labor market freedom scores by state
The three scores are the components of the Economic Freedom of North America (EFNA) Index reported by the Fraser Institute. The Fraser Institute is a think tank headquartered in Vancouver, British Columbia, that produces research about government actions in areas such as taxation, health care, aboriginal issues, education, economic freedom, energy, natural resources, and the environment.
Government Spending scores are designed to reflect the size of the government. Each score is calculated based on the following indicators: general consumption expenditures by government as a percentage of income, transfers and subsidies as a percentage of income, and insurance and retirement payments as a percentage of income. Taxation scores are aimed at assessing the tax burden. The score is calculated based on income and payroll tax revenue as a percentage of income, top marginal income tax rate and the income threshold at which it applies, property tax and other taxes as a percentage of income, and sales taxes as a percentage of income. Labor Market Freedom scores are based on minimum wage legislation, government employment as a percentage of total state/provincial employment, and union density. For each score, states/provinces in the U.S., Canada, and Mexico are included in the analysis, and are awarded points on a scale of 0–10. For details, see https://www.fraserinstitute.org/studies/economic-freedom-of-north-america-2017.
Economic freedom index by MSA
This index comes from a 2013 article by Dean Stansel in the Journal of Regional Analysis and Policy. To calculate each score, Stansel used the model of the 2011 Economic Freedom of North America (EFNA) Index by the Fraser Institute. As in the EFNA scoring system, points are awarded to MSAs on a scale of 0–10.
While EFNA reports scores only for states/provinces, Stansel’s study uses the model to assess economic freedom at a more granular level. In contrast to EFNA, this was a one-time study and it only includes areas within the United States. While most of the scores in the article are reported by Metropolitan Statistical Area, some are reported by metropolitan statistical division instead. We used the data for MSAs but, when information was available only for metropolitan statistical divisions, we selected the division in which the central city of the relevant MSA was located. For details, see http://www.jrap-journal.org/pastvolumes/2010/v43/index431.html.
Rate of new entrepreneurs, opportunity share of new entrepreneurs, and startup density
The data for these three variables come from the 2016 Kauffman Index for Startup Activity produced by the Kauffman Foundation. The Kauffman Foundation focuses its work on education and entrepreneurship.
The rate of new entrepreneurs measures the share of the adult population that became entrepreneurs in a given month. The opportunity share of new entrepreneurs measures the share of new entrepreneurs who were not unemployed or in school prior to becoming entrepreneurs. The startup density measures the number of startups per 1,000 firms, where startups are defined as businesses less than 1 year old that employ at least one person besides the owner. For details, see https://www.kauffman.org/kauffman-index.
Number of days with pleasant temperatures per year
For this metric, we used the Global Surface Summary of the Day (GSOD) database of the U.S. National Oceanic and Atmospheric Administration, accessing it through NCEI Climate Data Online Data Search. A day was counted as pleasant if the daily mean temperature was between 59 and 77 F, the maximum temperature did not exceed 85 F, and the minimum temperature did not fall below 45 F. We looked at the time period from 1/1/2007 to 12/31/2017, or shorter periods where data from a single station were unavailable for the full period. We then divided the total number of pleasant days by the number of years within the time period considered.
The values recorded by weather stations in or close to the central cities of relevant MSAs were chosen for analysis. Most of the stations chosen were located in international airports to maximize completeness and reliability of data. For details, see https://www7.ncdc.noaa.gov/CDO/cdoselect.cmd.
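The pleasant-day rule can be expressed as a short Python sketch. It assumes the 59–77 F bounds on the daily mean are inclusive, which the text does not state, and it uses hypothetical function names and sample readings rather than actual GSOD records.

```python
def is_pleasant(mean_f, max_f, min_f):
    """Apply the three temperature conditions (degrees Fahrenheit).

    Assumption: the 59-77 F range for the daily mean is inclusive.
    """
    return 59 <= mean_f <= 77 and max_f <= 85 and min_f >= 45


def pleasant_days_per_year(daily_records, n_years):
    """Count pleasant days and divide by the number of years covered.

    daily_records: iterable of (mean_f, max_f, min_f) tuples, one per day.
    """
    total = sum(1 for record in daily_records if is_pleasant(*record))
    return total / n_years


# Hypothetical daily readings (not real station data).
sample = [
    (65, 80, 50),  # pleasant
    (80, 90, 60),  # mean and maximum too high
    (60, 70, 40),  # minimum too low
    (59, 85, 45),  # pleasant, sitting exactly on all three bounds
]
average = pleasant_days_per_year(sample, n_years=2)
```

Dividing by the number of years actually covered keeps stations with shorter records comparable to those with the full 2007–2017 period.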
Republican and Democrat votes in the 2016 presidential election
This metric is based on the final official 2016 presidential election results reported by the state and county authorities (e.g., boards of elections, county clerks) on their websites. The data are reported by county. For each MSA, we include only the county where the central city is located. We calculated vote percentages by dividing the counts of Republican and Democrat votes by the total number of votes cast. In cases where the necessary data were not easily accessible on a government entity website, we used information from the NPR Election 2016 Results special series.
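The vote-share calculation can be sketched in Python as follows. The function name and the vote counts are hypothetical.

```python
def vote_shares(republican, democrat, total_votes):
    """Divide each party's vote count by the total number of votes cast."""
    return {
        "republican": republican / total_votes,
        "democrat": democrat / total_votes,
    }


# Hypothetical county totals (not real data).
shares = vote_shares(republican=400, democrat=500, total_votes=1000)
```

Because the denominator is all votes cast, the two shares need not sum to one when other candidates received votes.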
Well-being index
The Gallup-Sharecare Well-Being Index has been produced annually since 2008, based on a survey of more than 175,000 respondents. The scores and ranks are reported by MSA.
Each survey question used to calculate the index is associated with one of five elements of well-being chosen by Gallup-Sharecare: purpose (liking what one does, being motivated to achieve one’s goals), social (having supportive relationships, love), financial (managing one’s economic life to minimize stress and increase security), community (liking where one lives, feeling safe and proud of one’s community), and physical condition. Gallup categorizes the respondents as thriving, struggling, or suffering for each of the five elements. For details, see https://wellbeingindex.sharecare.com/.
Number of major professional sports league teams
This metric reflects the number of professional football, basketball, baseball, and hockey teams headquartered in each MSA included in the analysis that belong to the NFL, NBA, MLB, and NHL, respectively. We found the number of teams by consulting the NFL, NBA, MLB, and NHL websites.
Data estimates for Toronto
Since Toronto is not included in most of the datasets, we made manual estimates. We describe our methodology for the estimates below.
Average hours of sunshine per year. The measure is reported as “Total Hours of Bright Sunshine.” It is calculated by the Government of Canada using 1981–2010 station data from Toronto. This source was used in lieu of missing data from the World Meteorological Organization Standard Normals dataset for hours of sunshine. Both sources were last updated in 2010.
Number of good air quality days per year. This variable counts the number of “low risk” days (values of 1–3) on the Ontario Ministry of the Environment and Climate Change’s Air Quality Health Index as reported at the Toronto Downtown station. All values are from 2017 (like the data from the EPA).
Rate of population growth. This measure reflects the percentage change in population from 2011 to 2016. The period differs slightly from that used for MSAs in the U.S., which covers 2010–2017, because Statistics Canada’s population counts by census subdivision exist only for census years (2016, 2011, etc.). The quarterly data that could be used to find the 2010–2017 rate of growth are available only at the country, territory, and province levels.
Share of foreign-born population. Data on foreign-born individuals come from Statistics Canada’s 2016 Census, which reports the total number of immigrants born outside of Canada living in Toronto. Toronto’s population is also taken from Statistics Canada’s 2016 Census.
Share of employees in arts, entertainment, and culture. This measure is calculated by dividing the number of employees in arts, entertainment, and culture by the total number of employees in Toronto. The former value comes from Statistics Canada’s 2016 Census, which counts the number of workers in “Occupations in art, culture, recreation and sport” under its “National Occupational Classification.” The total number of employees in Toronto comes from the Census count of the “Total labor force population aged 15 years and over.”
Median housing costs per month. This value is the “median monthly shelter costs for rented dwellings,” reported in dollars in Statistics Canada’s 2016 Census. It is for Toronto only. Rented dwellings were used because combined data for rented and owned dwellings were not available.
Share of population who speak only English at home. This percentage is calculated by dividing the number of individuals with knowledge of only English in Toronto by the total population of Toronto excluding institutional residents (as data on their knowledge of languages were not collected). Data are from Statistics Canada’s 2016 Census.