Data extraction
Socioeconomic data were drawn from the 2006 Canadian Census. For our evaluation, we extracted Census data from CANSIM, Canada’s socioeconomic database, which provides free access to a range of the latest statistics [10]. The Census was completed on May 16, 2006, and included 32.5 million people. One in every five households received a long questionnaire with 53 questions, compared with 8 questions on the short form [11]. Here, we used data from the long questionnaire forms. These data cover all of Canada’s dissemination areas (DAs), which are small regions of 400 to 700 people [12]. Canada has 52,974 DAs, ranging from 34 in Nunavut to 18,923 in Ontario.
Variables
A set of 22 variables from the 2006 Census was selected based on: (1) cultural identities [13]; (2) potential environmental pollutants related to health outcomes [14, 15]; (3) Canadian environmental injustice studies [16–18]; and (4) variables utilized in the deprivation index for Canada proposed by Pampalon [3] (Table 1). Studies in the United States have indicated a clear relationship between race and SES for several racial groups [19–21]. To investigate this phenomenon in Canada, we grouped the cultural identities reported in the census as the individual’s ancestry into four categories based on the “Human Development Index” (HDI): origins from (1) very high sum; (2) high sum; (3) medium sum; and (4) low sum countries [13]. The HDI ranks countries by their level of human development according to life expectancy, literacy, education and standards of living [13]. We included a category for those with aboriginal identities based on responses to the “Indian Status” and “Aboriginal identities” questions on the Census, combining those who were North American Indian, Metis, Inuit, of multiple Aboriginal identities, and aboriginal responses not included elsewhere. Each variable was expressed as a proportion per dissemination area (DA). Variables obtained as raw counts of the answers to questions were converted to proportions by dividing by the number of people answering the question. Since the data used to create the index were collected from questions on the long form of the Census questionnaire, the proportions were based on variables corresponding to 20 % of the population of Canada. Employment rate, median income and prevalence of low income after taxes were not transformed, since they were already reported per DA in the census database.
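The conversion of raw counts to proportions described above can be sketched as follows. This is a minimal illustration only: the function name, the variable names and the counts are hypothetical and are not taken from the Census files.

```python
def to_proportions(counts, respondents):
    """Convert raw answer counts for one DA into proportions of respondents.

    counts      -- dict mapping a (hypothetical) variable name to a raw count
    respondents -- number of people in the DA who answered the question
    """
    if respondents <= 0:
        # No respondents: the proportion is undefined, so flag it as missing.
        return {name: None for name in counts}
    return {name: count / respondents for name, count in counts.items()}


# Hypothetical counts for a single DA (illustrative values only).
da_counts = {"aboriginal_identity": 12, "homes_before_1946": 30}
da_props = to_proportions(da_counts, respondents=120)
print(da_props)
```

Variables that the census database already reports per DA (employment rate, median income, prevalence of low income after taxes) would bypass this step.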
Table 1 Parameters and variables used in the selection for PCA analysis
Lastly, we incorporated a variable that we thought was important for health outcomes related to environmental pollution: age of the home (construction of homes before 1946, 1946–1970, 1971–1990, 1991–2006) as a proxy for age of the neighbourhood. In the United States, it has been shown that older neighbourhoods are more likely to have lead paint [15], asbestos [14] and have more infiltration of fine particles from outdoors to indoors [22].
Construction of SES index
Principal component analysis (PCA) with a single varimax rotation (factor loadings ≥ |60|) was performed on the selected 22 Census socioeconomic variables (SAS 9.2, North Carolina, USA). The analyses covered all DAs in Canada, and we used two criteria to select components: (1) the Kaiser criterion (eigenvalues ≥ 1); and (2) each component individually explaining ≥ 10 % of the overall variability. The final SES index was created by averaging, for each DA, the factor scores (a numerical representation of the linear relationship between variables and the components) of the three components retained.
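The component selection and averaging steps can be illustrated with a short sketch. It is not the SAS procedure used in the study. It assumes that both criteria must hold at the same time and that the PCA is run on a correlation matrix, so that the eigenvalues sum to the number of variables. The eigenvalues and factor scores are hypothetical values chosen for illustration.

```python
def select_components(eigenvalues, min_eigenvalue=1.0, min_variance_share=0.10):
    """Return indices of components passing both retention criteria.

    Criterion 1 (Kaiser): eigenvalue >= 1.
    Criterion 2: the component alone explains >= 10 % of total variance.
    Assumption: a component must satisfy both criteria to be retained.
    """
    total = sum(eigenvalues)
    retained = []
    for index, eigenvalue in enumerate(eigenvalues):
        share = eigenvalue / total
        if eigenvalue >= min_eigenvalue and share >= min_variance_share:
            retained.append(index)
    return retained


def ses_index(factor_scores, retained):
    """Average one DA's factor scores over the retained components."""
    return sum(factor_scores[i] for i in retained) / len(retained)


# Hypothetical eigenvalues for 22 variables (they sum to about 22).
eigenvalues = [6.6, 3.3, 2.4, 1.2, 0.9] + [7.6 / 17] * 17
retained = select_components(eigenvalues)

# Hypothetical factor scores of one DA on the first four components.
index_value = ses_index([1.0, -0.5, 1.0, 2.0], retained)
print(retained, index_value)
```

In this made-up example the fourth component passes the Kaiser criterion but explains less than 10 % of the variance, so only the first three components enter the average.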
Validation of our SES index
Adverse birth outcomes
We attempted to validate our index using the well-researched concept that low SES may be related to adverse birth outcomes [23, 24]. Here, data on all singleton live births between 1999 and 2008 in Edmonton were accessed through Statistics Canada (Additional file 1). The pregnancy outcomes under study were preterm birth (gestational age < 37 weeks), term low birth weight (LBW, < 2500 g), and small for gestational age (SGA, < 10th percentile of birth weight for gestational age). Spearman correlations and t-tests were used to assess associations between the index and pregnancy outcomes.
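The three outcome definitions can be written as a small classification function. This is a sketch under stated assumptions: "term" is taken to mean a gestational age of at least 37 weeks, and the 10th-percentile birth weight for the infant's gestational age is passed in as `sga_cutoff_g`, a hypothetical parameter, because the reference chart is not specified here.

```python
def classify_birth(gestational_weeks, birth_weight_g, sga_cutoff_g):
    """Flag the three adverse birth outcomes for one singleton live birth.

    sga_cutoff_g -- assumed 10th-percentile birth weight (grams) for this
                    gestational age, taken from an external reference chart.
    """
    preterm = gestational_weeks < 37
    # Term LBW: born at term (assumed >= 37 weeks) and weighing < 2500 g.
    term_lbw = (not preterm) and birth_weight_g < 2500
    sga = birth_weight_g < sga_cutoff_g
    return {"preterm": preterm, "term_lbw": term_lbw, "sga": sga}


# Hypothetical births (values chosen for illustration only).
print(classify_birth(39, 2400, sga_cutoff_g=2900))
print(classify_birth(35, 2300, sga_cutoff_g=2000))
```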
Particulate matter (PM 2.5)
We also evaluated another known [25] but less explored association: that between our SES index and concentrations of particulate matter with a mean aerodynamic diameter < 2.5 μm (PM2.5). Spearman correlations and t-tests were used to examine relationships between PM2.5 exposures and SES indices. PM2.5 exposures were assigned by mapping the mother’s six-character postal code to a monthly surface PM2.5 concentration, based on a North American land use regression model that incorporated observations from fixed-site monitoring stations and satellite-derived estimates of PM2.5. Exposures were estimated for the entire duration of pregnancy. Methods are described in detail elsewhere [26].
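For readers unfamiliar with the Spearman correlation, the following sketch computes it as the Pearson correlation of ranks, with tied values sharing their average rank. It is a generic implementation, not the software routine used in the study, and the index and PM2.5 values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # 1-based average rank
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical SES index values and PM2.5 exposures for five births.
ses_values = [-1.2, -0.4, 0.0, 0.7, 1.5]
pm25_values = [9.8, 9.1, 8.7, 8.9, 7.5]
print(spearman(ses_values, pm25_values))
```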
Comparison of Chan index to Pampalon index
We compared the association of our SES index and that of Pampalon’s [3] with adverse birth outcomes and PM2.5 concentrations using Spearman correlations and t-tests. The Pampalon index is a commonly used SES index in Canada that was developed using variables from the 2006 Census with: (1) known relations to health; (2) past use as geographical proxies; (3) past use with the material or social dimensions of deprivation; and (4) availability by DA [3]. PCA was applied to these variables and yielded two components, which are now used as the Pampalon indices: values for the material and social components. The Pampalon index values used to validate our index were accessed through their website [3, 27]. Both indices represent the Canadian SES situation in 2006, and comparisons assume a similar Canada-wide SES distribution around 2006.
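The form of the t-test is not specified here. As one possible illustration, the sketch below computes Welch's unequal-variance t statistic for the difference in mean index value between births with and without an outcome. The choice of Welch's test is an assumption, and the index values are hypothetical.

```python
import math
import statistics


def welch_t_statistic(group_a, group_b):
    """Welch's t statistic for the difference in means of two groups.

    Assumption: unequal variances are allowed (Welch's form); the source
    does not state which t-test variant was used.
    """
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical SES index values for births with and without an outcome.
index_with_outcome = [1.0, 2.0, 3.0]
index_without_outcome = [2.0, 3.0, 4.0]
t_value = welch_t_statistic(index_with_outcome, index_without_outcome)
print(t_value)
```

The same function could be applied to either index, so that the two can be compared on the same births.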