Background

Johne’s disease (JD) is a chronic, debilitating enteritis that affects ruminants and is caused by infection with Mycobacterium avium subsp. paratuberculosis (MAP) [1]. Financial losses due to JD in the U.S. dairy industry have been estimated to range between $200 to 250 million USD annually [2]. Reduced milk production and quality due to reduced fat and protein content, increased premature mortality, weight loss, early culling, costs of testing and control, and reduced slaughter value are among the negative impacts of JD [3,4,5,6]. In addition, although not confirmed, the potential link between MAP and the development of Crohn’s disease in humans further increases the hypothetical importance of JD [7]. A recent review highlighted that the global prevalence of JD is underestimated and setting objectives for surveillance and control measures is much needed [8].

The management and control of a chronic disease such as JD in a proactive and organized manner is challenging in the U.S. due to lack of regulatory requirements for testing [9], imperfect diagnostic tests [10], long-term survival of the pathogen outside the host [11], multiple routes of transmission, and the cost and labor necessary for long-term disease tracking [12].

JD is widespread in the U. S, and herd prevalence has been estimated as 60.7% in Midwestern U.S. dairies [13] and 91.1% nationally [14]. However, JD control is voluntary in the US and therefore testing for JD is not mandatory [9], which limits the availability of data and resources to monitor the disease. Studies suggest that limited adoption and compliance with JD testing and control strategies in dairy farms is a result of a) the chronic nature of the disease progression, therefore, the absence of the “cues-to-action” [15], b) the farmers’ perception of the limited cost-effectiveness of the herd control measures [16], and c) not perceiving JD as a “hot topic” during communications with other farmers and veterinarians [16].

Due to the lack of official disease monitoring, a common alternative for evaluating the epidemiological status of JD in a region is the use of data from voluntary testing programs, such as those collected by the Minnesota Dairy Herd Improvement Association (MNDHIA). Minnesota, a Midwestern state of the U.S., has nearly 460,000 dairy cattle and is among the top ten dairy states, 6th in terms of milk cow numbers and 8th in dairy herd numbers, per 2016 statistics available from the National Agricultural Statistics Services (NASS) [17, 18]. A proportion of Minnesota dairy farms utilize the services of MNDHIA, a member of the National Dairy Herd Improvement Association who provide a testing and production recordkeeping service to U.S. dairy farms (http://www.dhia.org/members.asp). However, it is unknown if data collected by the MNDHIA is useful as a passive surveillance tool to monitor JD in Minnesota.

The objectives of the study here were to a) test whether the voluntary JD testing program conducted by the MNDHIA can provide representative information on the prevalence of JD in dairy herds in Minnesota, b) estimate the JD distribution in Minnesota using data from the MNDHIA voluntary JD testing program, and c) identify possible herd and environmental variables associated with increased risk of having JD milk ELISA test-positive cows, using the available data. We hypothesized that results from the voluntary JD program might be used to evaluate JD status in an area and inform management decisions made by the testing agencies, veterinarians, and dairy producers. In evaluating the use of the MNDHIA database as a passive surveillance tool, our overarching objective was to generate evidence that could influence management decisions by recognizing modifiable factors to reduce the JD risk at the individual, herd, and regional levels. Results could therefore be useful in the design and implementation of surveillance programs for the U.S. dairy industry.

Results

Spatial representativeness

During the 2.5-year study period, 600/4746 (13%) dairy herds in Minnesota tested at least once for JD at MNDHIA laboratories, representing 18.7% (600/3210) of the licensed dairy herds in Minnesota with permits to ship milk for human consumption [17]. Figure 1 depicts the number of MNDHIA sampled dairy herds by county, the minimum required number of herds to estimate a true JD prevalence, and the number of dairy herds listed in the 2012 NASS Census of Agriculture (by terciles). We observed that the distribution of the MNDHIA participants included in this study mirrors the pattern of the milk cow herds included in the USDA NASS 2012 report (Fig. 1). Nevertheless, the minimum sample size required to estimate disease prevalence was not attained in any Minnesota county except Ramsey (where both the appropriate and observed sample sizes were 1).

Fig. 1
figure 1

The participation of study herds as a percentage of the ideal sample size, by county, is summarised with the graduated symbols. The Minnesota Dairy Herd Improvement Association testing laboratories are illustrated with triangles. The background colors in grey indicates the number of dairy herds in each county, based on 2012 census of the National Agricultural Statistics Service [18]. Map depicted here was generated as part of the current study

Descriptive statistics and spatial pattern recognition

The apparent herd-level prevalence of MAP, based on having 1+ cows with a positive milk ELISA, was 69% (414/600). The MNDHIA herds in this study included both small (< 100 cows, n = 332) and large (≥ 100 cows, n = 268) dairies. Figure 2 illustrates the Getis ord Gi* local test results for JD status and the covariates in the geographical space According to the Getis Ord Gi* analysis, we concluded that a certain risk of JD infection was present in herds throughout Minnesota, although herds in the southeastern corner were more likely to be JD milk ELISA test-positive compared to the herds in the northcentral region (Fig. 2: Panel a). The herd size of the study population showed a similar pattern where larger herds were located in the southeastern corner of the state, while smaller herds were in the northcentral region (Fig. 2: Panel b). We observed no spatial pattern in the testing frequency (Fig. 2: Panel c). Spatial patterns were observed for many of the remaining covariates: soil pH, soil type (texture), soil hydrologic characteristics (runoff potential), and agroecological characteristics (Fig. 2). The soil pH in Western Minnesota contained predominantly alkaline soils. However, the soil type (texture), hydrologic soil types, and agroecological characteristics demonstrated intricate patterns of spatial distribution. According to the Cramer’s V, none of the covariates were strongly associated with each other.

Fig. 2
figure 2

Spatial patterns of Johne’s disease status and the associated covariates. Points represent the location of study herds (n = 600). Panels (a, b, and c) illustrates the results of the Getis Ord Gi* local test where herds with high value of the variable next to herds with high values of the variable are represented in red (high-high clusters), herds with low value of the variable next to herds with low values of the variable represented in blue (low-low clusters), and no-matching pairs in yellow (high-low, or low-high values) [19, 20]. Covariates depicted in panels d though h include: (d) soil pH; (e) soil type/texture; (f) hydrologic soil type (i.e. runoff); (g) Agro-ecological characteristics; and (h) participants of the Voluntary Johne’s Herd Status Program for Cattle program (VJDHSP*). Maps depicted here were generated as part of the current study

Classification of JD status

Based on the appropriate number, i.e. minimum sample size per herd (Table 2) criterion, we observed 437/600 (72.83%) herds that had tested an appropriate number of cows during the study period. Of those 437 herds, 186 herds (31%) had no test-positive cows. As mentioned above, all other herds had at least one positive cow (414/600; 69%). Although the study presented here considered the entire study period, the suggested minimum sample sizes listed in Table 2 are the number of cows to be tested at each testing cycle.

Regression results

Herd size, testing frequency, and soil type (texture) were retained in the final multivariable model (Table 3). Herd turnover rates were available only from 454/600 (76%) herds and the association between JD status and herd turnover rate was not statistically significant in the univariate analysis (Table 3), therefore, excluded from the final multivariable model. The Moran’s I and Getis Ord Gi* statistics indicated that there was no spatial autocorrelation in the regression residuals (p-value > 0.05). Similarly, results of the fitted CAR model suggested that the spatial dependence for dairy herds located between 5 and 20 km was not significant after adjusting for the covariates, and its AIC value was higher than that of the model without the spatial component (Additional file 2: Table S2). The spatial correlation parameter of the CAR model (λ) was not significantly different from zero for any of the distance thresholds considered (Table 3).

Discussion

This study suggests the use of an existing voluntary testing program as a passive surveillance system to track JD in Minnesota. We demonstrated that the MNDHIA voluntary testing program may be a useful source to investigate the JD status of Minnesota dairies, with program participants in 2014–2017 representing 13% of the dairy herds across Minnesota and coming from areas with both high and low density of dairy farms. Even though county-level JD herd prevalence could not be reliably estimated because the required sample sizes were not achieved in most counties, we were able to estimate the herd-level JD status for the study area as 69%, as 414 of the 600 herds had at least one cow tested-positive for JD milk ELISA. As per the epidemiological factor analysis, the most important epidemiological factors contributing to the JD status of a herd were herd size, testing frequency, and soil type, i.e. texture. We did not observe spatial dependence of the residuals of the regression model indicating that the observation of similar characteristics in JD status in the participant dairy herds were explained by the three covariates, namely, herd size, testing frequency, and soil type/texture. These results will be used to inform the potential use of the database as a surveillance tool and to suggest improvements in JD testing program conducted by MNDHIA.

JD positive herds were distributed throughout Minnesota, although herds in the southern region were more likely to be JD milk ELISA test-positive compared to herds in the north-central region (Fig. 2: Panel a). Interestingly, the herd size of the study population showed a comparable pattern, with larger herds more likely in the southern region compared to smaller herds in the north-central region of the state (Fig. 2: Panel b). A similar observation where herd size was not evenly distributed in space, with larger herds being preferentially distributed in certain areas, was also found among participants in a Danish JD control program [21].

The herds located on silt, loam, or clay soils were more likely to be JD positive compared to the herds located on sandy soils. This observation contradicts early studies in the Midwest [22], which found that high silt content was associated with reduced detection of JD. However, studies by Dhand et al. [23] and Salgado et al. [24] described the potentially higher likelihood of detecting JD on loamy and clay soils, respectively. This survival of MAP on loamy soil was experimentally observed by Salgado et al. [24], which concluded that MAP tends to migrate slower through loamy soils compared to sandy soils and thus loamy soils may have MAP contaminated upper soil layers and pasture. Furthermore, according to Salgado et al. [24], in addition to soil type, the amount of rainfall and the soil pH also play an important role in the fate of MAP in the environment. Soil hydrologic characteristics and pH were not included in the final model, but it is worthwhile to acknowledge that soil texture, pH, and hydrologic characteristics are interconnected [25] and further analysis is needed to understand the association of soil features and the persistence of MAP in manure contaminated environments [26].

Some of the limitations associated with this study include the limited generalizability of the results because MNDHIA data represents a purposive, non-random proportion of dairy herds in Minnesota. Moreover, the lack of information on herd characteristics other than herd size and testing frequency is another limitation that should be considered when interpreting the results. A more insightful interpretation requires herd management details including maternity pen management, which cows were chosen for testing for JD (high-risk cows or cows that were being sold), and the management decisions towards JD positive cows [27]. Although the long-term survival of MAP in loamy or clay soils was suggested in other studies [23], we acknowledge that additional information on the access of cattle to pastures would be necessary to establish a causal relationship between soil type and MAP exposure. Other potential routes of exposure through forages originated from other locations or grazing on pasture lands in different geographical areas such as rented pasture lands located elsewhere would certainly affect this interpretation. When data arise from imperfect surveillance systems, the interpretation of results must be done with caution because the covariates can be related either to the occurrence of the disease or to the efficiency of the data collection system [28].

Although MNDHIA does not currently offer a control program, through this study we recognized opportunities to improve the MNDHIA database to be used as a passive surveillance tool. These opportunities include: the determination of the number of herds to be sampled to establish the prevalence of JD in Minnesota, the determination of the number of animals to be sampled from each herd to ensure a reliable evaluation of its JD status (positive/negative), ensuring that farmers provide accurate farm location information, and regularly collecting herd-level information on other relevant factors such as biosecurity measures to facilitate a better assessment of JD status/risk. Moreover, having the disease and underlying factors collected frequently over the time would facilitate conducting spatiotemporal analysis and enable facilitate making temporal confidence of “disease free zones” instead of static estimates generated through a cross-sectional analysis.

A JD surveillance program would be costly to establish and a voluntary testing database could be preferred for monitoring endemic pathogens causing chronic diseases like MAP. Thus, the strength of using MNDHIA data as a source is that there are 16National Dairy Herd Improvement (DHI) Association laboratories across the U.S. (http://www.dhia.org/members.asp) and their network already acts as a record keeping system for dairy farms. Having a system to evaluate the JD status in a region would benefit the dairy industry in multiple ways, such as recognizing the differences between participant and non-participant dairy farms in the voluntary testing programs, understanding underlying risk factors and covariates in the neighborhoods, and eventually recognizing disease-free areas [6]. Moreover, records available from DHI and the Council of Dairy Cattle Breeding (https://queries.uscdcb.com/) together would have been a useful system to trace the movements/transfers or termination of individual cows, which could facilitate further investigations of how cattle movement play a role in the JD transmission. However, it is important to recognize that the choice to test for JD may not necessarily overlap temporally to make a causal association between cow transfers and JD.

Participation in voluntary testing and control programs varies due to multiple factors such as: a) farmers’ belief in the importance of JD [16], b) farmers’ belief in control and prevention strategies including the investment of time and resources [16], and c) availability of the testing facilities and trained personnel to conduct testing at convenience [29, 30]. An examination of the reasons why dairy farmers choose to test, or not, for JD status exceeded the scope of this study, although such assessment may be of interest in assessing the value of voluntary testing and control programs. This study elucidates the evaluation of JD at herd and regional level using available data. The individual-level data analysis of the same dataset was presented elsewhere [31].

Conclusion

In summary, results reported here suggest that a routinely generated database obtained from a voluntary testing program can be used as a passive surveillance tool to monitor the infection status and epidemiological determinants of JD in a region. However, because the risk of introduction may always be present, successful prevention and control of JD depends on ongoing willingness to continue funding surveillance and research on the disease by both animal health authorities and the support from the community and stakeholders of livestock industry [32, 33].

Methods

Data

Data from the voluntary JD testing program conducted by the MNDHIA were collected through a 2.5-year period, between November 1st, 2014 and April 30th, 2017. Although there were records from 723 JD testing herds, 123 herds were excluded from the study (Additional file 1: Figure S1). The reasons for exclusions were: herds located outside Minnesota; herds without location information, or herds that did not test the minimum number of cows required for the study, as explained below.

Milk samples (n = 70,809) collected from 54,652 unique cows from 600 Minnesota dairy herds were included in the study. Samples were analyzed using the IDEXX MAP Enzyme-Linked Immunosorbent Assay (ELISA) (IDEXX Laboratories, Inc., Maine, USA) for detection of antibodies against MAP in milk. Some herds (208/600; 35%) were tested only once during the 2.5-year study period. The median number of times a herd was tested was 2 (interquartile range between 1 and 11).

To identify relevant herd and environmental factors associated with JD risk at the herd level, the scientific databases ‘Web of Science’ and ‘PubMed’ were queried to find publications using the following search string: ‘Johne’s disease,’ AND ‘Mycobacterium avium subsp. paratuberculosis,’ AND ‘dairy cattle,’ AND ‘risk factors’. Reviews published in peer-reviewed journals in English language were selected, and reviews on human Crohn’s disease were excluded. A total of seven reviews published between 2001 and 2017 on JD [32, 34,35,36,37,38,39] were used to identify the most commonly recognized JD risk factors in North American dairy farms. Reviews were then examined for identification of relevant risk factors for North America, and primary articles cited were used for identification of the variables to be considered in this study (Table 1). Due to unavailability of relevant data in our dataset, the following herd features were excluded from analysis: manure management, immediate culling of JD positive cows, management of the maternity pens and calves, and maintaining closed herd or purchasing animals from farms with improved management practices to control JD [12].

Table 1 Herd demographic factors and environmental factors associated with JD in North American dairy cattle, according to the published literature
Table 2 Minimum sample sizes required to estimate freedom from Johne’s Disease (JD) at the herd level, using an imperfect test and adjusting for a finite population calculated using the AusVet EpiTool Epidemiological calculator (URL: http://epitools.ausvet.com.au)
Table 3 Odds ratios, coefficients, and p-values of the association between epidemiological factors and herd-level Johne’s disease status, based on ELISA assays performed on individual milk samples in 600 herds in Minnesota

According to the published literature, herd size, testing frequency, and geographical region are associated with JD [21, 29]. Information on those three variables was extracted from the MNDHIA database. Data on herd size and herd turnover rates (454/600 herds) were available in the form of snapshots at the beginning of 2014, 2015, 2016, and 2017. Herd turnover rate calculations were performed by MNDHIA, based on the recommendations by Fetrow et al. [45]. Both herd sizes and herd turnover rates were averaged across the years for this cross-sectional analysis. Farm addresses were verified and geocoded using ArcMap version 10.3.4 [46]. Because the spatial dependence of JD risk has been described for neighboring farms [44], the possible existence of a spatial pattern in the risk of JD for neighboring farms located within 5 through 20 km was accounted for in the analyses.

The voluntary participation in JD control programs has also been described as a factor associated with JD status in a farm [30, 47]. In the absence of data from a control program, information on whether farms included in this study were currently participating in the Voluntary Johne’s Disease Herd Status Program for Cattle program (VJDSHP) was retrieved from the Minnesota Board of Animal Health [9, 29]. VJDSHP was introduced by USDA APHIS in 2002 as a gradual process which allows to recognize herds with low JD prevalence or free from the disease [48, 49]. As of 2017, among the study population, 24/600 (4%) herds were part of the Voluntary Johne’s Disease Herd Status Program (VJDHSP).

Layers of soil pH and soil type in Minnesota were obtained from the Natural Resource Conservation Services of the United States Department of Agriculture (USDA) [25, 50, 51]. The hydrologic soils data were used to estimate the runoff potential of the soils, using the Hydrologic Soil map available from the Web Soil Survey [52]. Because of the scarcity of accurate data, soil iron content was not considered in the study. Information on agroecological biome features such as grassland, shrubland, forest, and cropland were obtained from the Minnesota Geospatial Commons (https://gisdata.mn.gov/), which was based on remote sensing data creating the Land Cover Data Portal under the National Gap Analysis Project (GAP) [53]. The Minnesota GAP classification level 2 land-use/land-cover class code of the GAP data layer was used in the analysis (Additional file 2: Table S1).

Data analysis

Representativeness of MNDHIA herds

Minnesota counties were classified into terciles based on the number of NASS dairy herds present (1 to 12, 13 to 42, and > 43 dairy herds per county). Counties without records for milk cow herds in the NASS 2012 statistics were excluded for this calculation (n = 1; Cook county). To evaluate whether the study herds were representative of all Minnesota dairy herds, the number of study herds per county (tested for JD) was compared to the appropriate sample size, i.e. number of dairy herds to be sampled from each county to estimate the true herd prevalence of JD using an imperfect test and adjusting for a finite population (calculated using the AusVet EpiTool Epidemiological calculator (URL: http://epitools.ausvet.com.au) [54, 55]. Total number of herds present in each county was extracted from the National Agricultural Statistical Services (NASS) 2012 Census of dairy herds [18]. In addition, sample size calculations assumed an expected true herd prevalence of JD of 60% [13], a desired herd-level sensitivity of 70% and herd-level specificity of 70% [56], a precision for the estimate of +/− 10% and a level of confidence of 80%. The number of herds included in the MNDHIA database was then compared to the sample size required to accurately estimate prevalence.

Descriptive statistics and spatial pattern recognition

The apparent JD prevalence, the spatial distribution of JD milk ELISA test-positive farms, and the presence of spatial autocorrelation in the risk of JD and in other covariates considered were visualized and, for the latter, estimated using Morans’ I and Getis Ord Gi* statistics [19, 20, 57]. Morans’ I statistics measures the overall spatial autocorrelation of the herds based on both locations and value of the variable simultaneously [57]. The Getis Ord Gi* recognizes areas where the local sum of values for a given variable significantly differ from the expected location sum [19, 20]. This statistic identifies herds with high value of the variable next to herds with high values of the variable (high-high clusters), herds with low value of the variable next to herds with low values of the variable (low-low clusters), and no-matching pairs (high-low, or low-high values). Categorical variables, not suited for Getis Ord Gi* analysis, were mapped for visualization (Fig. 2).

Classification of JD status

The output variable (herd JD status) was dichotomized as follows: 1) ‘Negative herds’ in which an appropriate number of unique cows were tested during the study period and were all negative (see below definition of appropriate number) and 2) ‘Positive herds’ in which at least one cow was tested positive on the milk ELISA test during the entire study period. The appropriate sample size to certify disease freedom in a herd, i.e. minimum required sample size per herd, using an imperfect test and adjusting for a finite population, was set for each herd using the AusVet EpiTool Epidemiological calculator (URL: http://epitools.ausvet.com.au) [54, 55]. Assumptions for the calculations were a) design prevalence, i.e. expected within-herd prevalence when infected, of 10% [58], and b) sensitivity of the diagnostic test of 52% (IDEXX Laboratories Inc., Maine, USA). Herds that tested fewer cows than the appropriate sample size and that had no positive cows were excluded from the analysis because their apparent JD disease freedom/negative state could not be reliably demonstrated.

Regression analysis

The outcome variable used in all models was the farm-level JD milk ELISA test-status (positive, negative). The herd size, which ranged between 1 and 1929 cows, was categorized using Jenks natural breaks method [59]. Similarly, herd turnover rate was categorized into three based on Jenks natural breaks. The testing frequency per herd during the 2.5 years varied between once and 30 times, and was categorized into four classes (1, 2 to 5, 6 to 20, and > 20). Soil pH values ranged between 5.6 and 7.5, and were categorized into three (< 6.0, 6.0 to 7.0, and > 7.0) groups. The soil type/texture was re-categorized into four classes based on the percentage of different types of particles as clay (> 50% clay and ≤ 50% silt), sand (> 50% sand and ≤ 50% clay), silt (> 50% silt and ≤ 50% sand), and loam (equal proportions of sand, silt and clay, i.e. ≤50% sand, ≤50% clay, and ≤ 50% silt), following the soil texture triangle model described elsewhere [60, 61]. Soil hydrologic characteristics were summarized into four categories: 1) Type A with low runoff potential, 2) Type B with moderately low runoff potential, 3) Type C with moderately high runoff potential, and 4) Type D with high runoff potential, when completely wet [51, 52, 61]. Agroecological biome features included four categories, namely, crop/grassland, non-vegetated land, shrubland, and deciduous forest (Additional file 2: Table S1). The VJDHSP participation status was incorporated into the model as a dichotomous variable (current participant vs. non-participant herds).

To avoid multicollinearity, a simple logistic regression was used to assess the marginal association between JD status and the covariates. The strength of the association between pairs of covariates was analyzed using Chi-square test followed by a Cramer’s V test. Variables with p-value < 0.2 and that were not significantly associated among them (i.e., Chi-square p-values > 0.05 and Cramer’s V > 0.5) were tested as candidate variables in the full multivariable logistic regression model. To prevent overfitting, the full model, including all possible 2-way interactions deemed biologically plausible, was subjected to backward stepwise regression based on the lowest Akaike Information Criterion (AIC) until the most parsimonious (final) model was fitted. The regression analysis was carried out using the R Statistical Software (Foundation for Statistical Computing, Vienna, Austria). For the descriptive analysis and logistic regression, we used R packages ‘Base’ [62] and ‘MASS’ [63].

Evidence of spatial autocorrelation in the residuals of the final regression model was assessed using the global Morans I and local Getis Ord Gi* tests. To account for potential spatial dependence in the outcome variable, a proper conditional autoregressive model (CAR) structure was included in the model [64] using the “spdep” R package [65, 66]. Distances of 1 km, 5 km, 10Km, 15 km, 20 km and the minimum distance between herds that guaranteed all herds had at least one neighbor were tested alternatively to define the neighborhood matrix in the CAR [67]. The isolated herds without a neighbor at each distance thresholds were assigned a lag value of zero at each model fit. The AIC values and significance of the spatial correlation parameter of the CAR model (λ) from the regression models with and without the CAR model structure were compared to select the best model [67].