Sources of Data
Human Case Data
The Canadian province of Ontario is divided into 36 public health units, and all confirmed cases of Gd are reported to the provincial Integrated Public Health Information System (iPHIS). The WHR encompasses part of the multi-use Grand River watershed located in southwestern Ontario (Fig. 1). In 2011, the 1369 km² region was densely populated, with 507 096 people, 815 cattle farms and 210 swine farms (Statistics Canada 2011).
Confirmed cases of Gd are reported to iPHIS based on the finding of cysts or trophozoites in stool samples from patients with compatible clinical signs (after 2009, demonstration of Gd antigen by enzyme immunoassay or immunochromatographic test was also accepted in place of microscopy) (Ontario Ministry of Health and Long-Term Care 2016). The case report date could be the date of symptom onset, the specimen submission date, the lab report date or the date the case was reported to the database. Underreporting of enteric diseases is considerable because asymptomatic, transient or mildly affected patients may not attend a health care provider (Murphy et al. 2016). Therefore, every unique case submitted to iPHIS was included to maximize the study population. This study population is thought to be representative of patients suffering impairment from disease. Cases who reported travel outside of the province of Ontario during the disease incubation period, or who could not confirm their travel history during the incubation period, were excluded. All cases in the dataset for the time series were identified as sporadic based on the provincial outbreak numbering system. Cases not identified by the local public health unit as being linked to an outbreak were considered to have occurred independently of other cases and were coded with a unique identifier.
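The inclusion and exclusion rules above can be sketched as a simple filter. This is a minimal illustration only: the record layout and field names (`case_id`, `travelled_outside_ontario`, `outbreak_id`) are hypothetical and do not reflect the actual iPHIS schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseReport:
    # All field names are hypothetical, chosen for illustration only.
    case_id: str
    travelled_outside_ontario: Optional[bool]  # None = travel history not confirmed
    outbreak_id: Optional[str]                 # None = not linked to an outbreak


def eligible_sporadic_cases(reports):
    """Keep each unique case once, dropping travel-related and outbreak-linked cases."""
    seen = set()
    kept = []
    for report in reports:
        if report.case_id in seen:
            continue  # duplicate submission of the same case
        seen.add(report.case_id)
        if report.travelled_outside_ontario is not False:
            continue  # travelled, or travel history could not be confirmed
        if report.outbreak_id is not None:
            continue  # linked to an outbreak, so not sporadic
        kept.append(report)
    return kept
```

Treating an unconfirmed travel history the same as reported travel mirrors the exclusion rule stated in the text.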
Climate and Hydrology Data
The WHR has a temperate climate with distinct seasons, characterized by warm summers and cold winters with below-freezing temperatures (Peel et al. 2007). Weather and hydrology observations for this region were measured daily over the study period from 1 June 2006 to 31 December 2013. The environmental variables included mean air temperature (°C), total precipitation (mm), river water level (m) and river water flow rate (m³/s) (Environment and Climate Change Canada 2016a, b). Studies on the physical characteristics of Giardia have found that temperature is an important parameter influencing survival and recovery of viable cysts (Erickson and Ortega 2006). Because this study focused on waterborne transmission, precipitation, water flow and water level were variables of interest based on our a priori hypothesis. Weather was recorded at the Region of Waterloo International Airport, Waterloo, Ontario; daily water flow and water level observations were collected on the Grand River at Doon, Waterloo County, Ontario (Environment and Climate Change Canada 2016a, b).
Livestock Reservoir Data
The pilot site of the FoodNet sentinel surveillance program of the Public Health Agency of Canada operated in the WHR between 2005 and 2014. Farm-level surveillance of livestock manure samples was conducted. Each month, one or more farm types (dairy, beef or swine) were visited, and three fresh, pooled manure samples and one pooled manure pit sample were collected (Public Health Agency of Canada 2010). Each fresh sample comprised manure from five animals representing different ages, and the manure pit sample comprised up to five subsamples collected from different depths. Samples were tested by microscopy and PCR. At least one positive sample per farm per collection date underwent further molecular subtyping to determine the assemblage (Public Health Agency of Canada 2010). PCR and/or microscopy results for manure samples were recorded as positive, negative, or not done. A farm was labelled positive if at least one sample from the farm on a given collection date was positive for Gd by PCR and/or microscopy. Numeric cyst counts were not performed. Aggregated counts of Giardia-positive farms per manure testing date were compiled to form a time series of positive livestock reservoirs in the region between 1 June 2006 and 31 December 2008.
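The farm-level positivity rule and the aggregation into a time series can be sketched as follows. The tuple layout and the result strings are assumptions made for illustration; the surveillance program's actual data format is not described here.

```python
from collections import defaultdict

POSITIVE = "positive"  # other recorded values: "negative", "not done"


def positive_farms_per_date(samples):
    """Count Giardia-positive farms for each manure collection date.

    `samples` is an iterable of hypothetical tuples:
    (collection_date, farm_id, pcr_result, microscopy_result).
    A farm is positive on a date if any of its samples is positive
    by PCR and/or microscopy.
    """
    positive_farms = defaultdict(set)
    collection_dates = set()
    for collection_date, farm_id, pcr, microscopy in samples:
        collection_dates.add(collection_date)
        if pcr == POSITIVE or microscopy == POSITIVE:
            positive_farms[collection_date].add(farm_id)
    # Dates with visits but no positive farm are kept with a count of zero.
    return {d: len(positive_farms[d]) for d in sorted(collection_dates)}
```

Using a set per date ensures that a farm with several positive samples on the same collection date is counted once.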
Data Organization
The study was separated into two parts to reflect the two objectives. Objective I evaluated associations between human cases and environmental variables between 1 June 2006 and 31 December 2013. For Objective II, the subset of Objective I observations collected between 1 June 2006 and 31 December 2008 was merged with the time series of Giardia-positive farms (livestock reservoir data) to create a dataset that integrated human, animal and environmental data.
Statistical Analysis
Seasonality
Seasonal fluctuation of human Giardia incidence was assessed for statistical significance using a linear combination of sine and cosine terms. Following the methods previously described by Ng et al. (2008), monthly sine and cosine variables were used to assess seasonality and a year term was used to evaluate annual trends:
$$ E(Y) = \exp\left( \alpha + \beta_1 (\text{year}) + \beta_2 \sin\left( \frac{2\pi \, \text{month}}{12} \right) + \beta_3 \cos\left( \frac{2\pi \, \text{month}}{12} \right) \right) $$
where α is a constant, β1, β2 and β3 are the regression coefficients for the year, sine and cosine terms, respectively, and E(Y) is the expected case count for a given month (Ng et al. 2008).
Annual variation and statistical significance of the seasonal smoothing terms were evaluated using a Poisson regression model with no predictor variables included. Significant smoothing terms (P ≤ 0.05) were included in univariable analysis models of predictor variables; oscillating seasonal smoothers (sine and cosine terms) and the year variable were forced into all multivariable models to adjust for the expected confounding effects of seasonal and annual trends.
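To make the seasonal model concrete, the sketch below evaluates E(Y) for given coefficients and converts the sine and cosine coefficients into an amplitude and a peak month, using the identity β2 sin θ + β3 cos θ = A cos(θ − φ) with A = √(β2² + β3²) and φ = atan2(β2, β3). The function names are illustrative and any coefficient values passed in are placeholders, not estimates from this study.

```python
import math


def expected_count(year, month, alpha, b_year, b_sin, b_cos):
    """Expected monthly case count under the harmonic Poisson model."""
    angle = 2 * math.pi * month / 12
    linear = alpha + b_year * year + b_sin * math.sin(angle) + b_cos * math.cos(angle)
    return math.exp(linear)


def seasonal_amplitude_and_peak(b_sin, b_cos):
    """Return (amplitude on the log scale, month at which the seasonal term peaks).

    The peak month is on a 0-12 scale where 0 is equivalent to 12 (December).
    """
    amplitude = math.hypot(b_sin, b_cos)
    phase = math.atan2(b_sin, b_cos)
    peak_month = (phase * 12 / (2 * math.pi)) % 12
    return amplitude, peak_month
```

On this parameterization, the ratio of the expected count at the seasonal peak to that at the trough is exp(2A), holding the year term fixed.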
Multivariable Regression Models
For each objective, Poisson regression analysis was used to examine temporal associations of average monthly weather conditions (temperature and precipitation), hydrological conditions (river flow and level), and monthly aggregated Giardia-positive farms (Objective II) with monthly aggregated human case counts. Canadian census information provided the total annual population at risk to determine the incidence rate ratio of giardiasis in people in WHR. A 1-month lag period for environmental and livestock reservoir variables was chosen as it included the incubation period of Gd in humans (approximately 7–14 days) as well as time for the pathogen to move from livestock and the environment into the human population. Correlation between predictor variables was assessed (cutpoint > 0.8), and highly correlated pairs of variables were reduced by eliminating one of the variables from further analysis. After univariable analysis (adjusted for statistically significant seasonal smoothers), monthly averaged environmental variables and monthly aggregated Giardia-positive farms with incidence rate ratios (IRRs) that met a liberal P value (P ≤ 0.2) were brought forward into a multivariable Poisson regression model, and a backward selection process was used to determine the final multivariable model (P ≤ 0.05). Scatter plots of predicted outcomes and Anscombe residuals were created to assess outliers and covariate patterns with high influence on the model.
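The core of the univariable step, a 1-month lagged exposure in a Poisson model with the population at risk as the exposure term, can be sketched as below. This is a simplified illustration in plain Python with a single predictor and a hand-written Newton-Raphson fit; it omits the seasonal and year terms, it is not the Stata code used in the study, and the function names are hypothetical.

```python
import math


def lag_one_month(values):
    """Shift a monthly series so that month t carries the value from month t-1."""
    return [None] + list(values[:-1])


def fit_poisson_offset(cases, exposure, population, iterations=25):
    """Fit log E(cases) = log(population) + a + b * exposure by Newton-Raphson.

    Returns (a, IRR), where IRR = exp(b) is the incidence rate ratio
    per one-unit increase in the exposure.
    """
    a = math.log(sum(cases) / sum(population))  # start at the crude log rate
    b = 0.0
    for _ in range(iterations):
        mu = [p * math.exp(a + b * x) for p, x in zip(population, exposure)]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(cases, mu))
        g1 = sum((y - m) * x for y, m, x in zip(cases, mu, exposure))
        # Information matrix entries (negative Hessian).
        h00 = sum(mu)
        h01 = sum(m * x for m, x in zip(mu, exposure))
        h11 = sum(m * x * x for m, x in zip(mu, exposure))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, math.exp(b)
```

In use, the first month of the lagged series has no exposure value (`None`) and would be dropped before fitting, so that case counts in month t are paired with the exposure recorded in month t − 1.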
Case Crossover Models
A case crossover design was used to assess the impact of acute environmental and livestock exposures on daily human case counts. This approach is useful for rare diseases with a short incubation period and intermittent exposures that create different risk periods through which subjects pass temporally (Maclure 1991; Levy et al. 2001; Ng et al. 2008). The self-matching, case crossover design compares the exposure status immediately before a case occurs to the self-matched exposure status during a control period which is randomly selected to occur prior to, after, or spanning the hazard period (Levy et al. 2001). A time-stratified, 4:1 matched design was used to match four control periods by day of week to the case report date within each 4-week stratum. Conditional logistic regression models were used to assess statistical associations between same week and 1–4-week-lagged exposures and daily human case counts. Environmental variables were ranked from lowest to highest within each stratum to further elucidate which exposure most contributed to risk of human case development. A distributed lag model was used to account for transmission of the pathogen through the environment and the average incubation period in humans.
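The within-stratum ranking of environmental variables mentioned above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: strata are taken as consecutive 28-day blocks counted from a hypothetical study start date, rank 1 is the lowest value, and tied values share their average rank.

```python
from collections import defaultdict
from datetime import date


def stratum_index(day, study_start, stratum_days=28):
    """Index of the fixed-length stratum that `day` falls into (assumed layout)."""
    return (day - study_start).days // stratum_days


def rank_within_strata(daily_values, study_start, stratum_days=28):
    """Map each date to the rank of its value within its stratum (1 = lowest)."""
    strata = defaultdict(list)
    for day, value in daily_values.items():
        strata[stratum_index(day, study_start, stratum_days)].append((value, day))
    ranks = {}
    for members in strata.values():
        members.sort()
        i = 0
        while i < len(members):
            # Find the run of tied values starting at position i.
            j = i
            while j + 1 < len(members) and members[j + 1][0] == members[i][0]:
                j += 1
            average_rank = (i + j) / 2 + 1
            for k in range(i, j + 1):
                ranks[members[k][1]] = average_rank
            i = j + 1
    return ranks
```

Ranking within strata puts exposures from different seasons on a common scale, so a day is compared only with other days in the same 4-week window.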
Case crossover associations were measured as odds ratios (OR). Statistically significant associations between human cases and exposures reported during the concurrent week (i.e. exposures with no lag period) were not considered to be biologically feasible, given that the mean incubation period for Gd in people is 7–14 days. All statistical analyses were conducted using Stata 14.0 (StataCorp, College Station, TX).