Environmental and Ecological Statistics, Volume 22, Issue 2, pp 207–226

A Bayesian hurdle model for analysis of an insect resistance monitoring database

  • Matthew G. Falk
  • Rebecca O’Leary
  • Manoj Nayak
  • Patrick Collins
  • Samantha Low Choy


Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends over time in strong resistance of stored grain insects to phosphine. The binary response variable from AGIRD, which indicates presence or absence of strong resistance, is dominated by absence observations, and the hurdle model is a two-step approach suited to analyzing such a binary response. In the first step, Bayesian classification trees identify the covariates and covariate levels associated with possible presence or absence of strong resistance. In the second step, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the data that the classification tree identifies as having possible presence of strong resistance. From the GAM we assess trends, biosecurity issues and site-specific variables influencing the presence of strong resistance. The proposed Bayesian hurdle model is compared with its frequentist counterpart and with a naive Bayesian approach that fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step, appears to provide enough flexibility to represent the influence of variables on strong resistance compared with the frequentist model, and captures subtle changes in the trend that are missed by the frequentist and naive Bayesian models.
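The two-step structure of the hurdle model can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the toy records, the covariate levels ("farm", "depot", "mill") and both function names are hypothetical, a single split on one categorical covariate stands in for the Bayesian classification tree, and a plain logistic regression on time fitted by gradient ascent stands in for the GAM with spike and slab priors.

```python
import math

# Hypothetical toy records: (covariate level, years since monitoring began,
# strong resistance 0/1). Illustrative values only, not AGIRD data.
records = [
    ("farm", 0, 0), ("farm", 1, 0), ("farm", 2, 0), ("farm", 3, 0),
    ("depot", 0, 0), ("depot", 1, 0), ("depot", 2, 1), ("depot", 3, 1),
    ("mill", 0, 0), ("mill", 1, 1), ("mill", 2, 0), ("mill", 3, 1),
]


def possible_presence_levels(rows):
    """Step 1 stand-in: keep levels in which at least one presence was observed.

    A classification tree would find such groups by recursive splitting; here
    a single split on one categorical covariate plays that role (assumption).
    """
    return {level for level, _, y in rows if y == 1}


def fit_logistic_trend(rows, steps=5000, lr=0.1):
    """Step 2 stand-in: logistic regression of presence on time.

    Fitted by gradient ascent on the mean log-likelihood. A linear-in-time
    logit replaces the GAM purely to keep the sketch self-contained.
    """
    b0, b1 = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        g0 = g1 = 0.0
        for _, t, y in rows:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * t)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * t    # gradient with respect to the time slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Step 1: screen out the covariate levels with no observed presence.
levels = possible_presence_levels(records)
subset = [r for r in records if r[0] in levels]

# Step 2: model the trend only on the subset that cleared the hurdle.
intercept, slope = fit_logistic_trend(subset)
```

In this toy example the "farm" level contains only absences, so it is excluded before the trend is fitted; the slope then describes how the probability of strong resistance changes over time within the remaining levels only, which is the point of clearing the hurdle first.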


Bayesian classification trees, Generalized additive models, Hurdle model, Insect resistance, Phosphine



The authors would like to thank Dr Clair Alston for very helpful comments on the manuscript. Drs Falk, Nayak, Low Choy and Collins would also like to acknowledge the support of the Australian Government's Cooperative Research Centres Program.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Matthew G. Falk (1, 2)
  • Rebecca O’Leary (3)
  • Manoj Nayak (2, 4)
  • Patrick Collins (2, 4)
  • Samantha Low Choy (1, 2)

  1. Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  2. Plant Biosecurity Cooperative Research Centre, Bruce, Australia
  3. Department of Agriculture and Food, Western Australia, South Perth, Australia
  4. Department of Agriculture, Fisheries and Forestry, Queensland, Brisbane, Australia
