Modeling the establishment of invasive species: habitat and biotic interactions influencing the establishment of Bythotrephes longimanus
Bythotrephes longimanus is an invasive pelagic crustacean that first arrived in North America from Europe in the early 1980s and can now be found throughout the Great Lakes and in many inland lakes and waterways. Determining the suitability of lakes for Bythotrephes establishment is an important step in quantifying its potential habitat range and environmental risk. In this study, data on lake environmental conditions, planktivorous fishes, sport fishes and Bythotrephes occurrence from 179 south-central Ontario lakes were used to model the lake characteristics suitable for its establishment. Principal component analysis and the comparative performance of different predictive models were used to identify the habitats suitable for the survival of Bythotrephes and the factors that may regulate its spread. Four modeling approaches were employed: linear discriminant analysis, multiple logistic regression, random forests, and artificial neural networks. An ensemble prediction based on the four modeling approaches was also used as an indicator of Bythotrephes occurrence. Bythotrephes appears to establish more readily in larger, deeper lakes at lower elevations that have more sport fishes. Bythotrephes occurrence was best predicted by artificial neural networks when fish data were included in addition to lake environmental data. Lake elevation, surface area and sport fish occurrence were ranked as the most important predictors of Bythotrephes invasion. Including biotic variables (occurrence or diversity of sport or planktivorous fishes) improved cross-validated model performance relative to analyses based on environmental data alone.
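As a rough illustration of how an ensemble prediction could be formed from four presence/absence models and then scored, the following Python sketch combines per-model predictions by strict majority vote and evaluates them with the phi correlation coefficient. The abstract does not state how the four models were combined or exactly how phi was applied, so the voting rule, the function names `ensemble_vote` and `phi_coefficient`, and the small data set are assumptions made purely for illustration; they are not the study's data or method.

```python
from math import sqrt


def phi_coefficient(observed, predicted):
    """Phi correlation between two equal-length binary (0/1) sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # presence predicted, observed
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # absence predicted, observed
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false presence
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # false absence
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Phi is undefined when a row or column of the 2x2 table is empty;
    # returning 0.0 in that case is a convention chosen for this sketch.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def ensemble_vote(model_predictions, min_votes=None):
    """Combine per-model 0/1 predictions into one prediction per lake.

    Assumption: a lake is predicted as invaded when at least `min_votes`
    models predict presence. The default is a strict majority, so a 2-2
    tie among four models counts as absence.
    """
    n_models = len(model_predictions)
    if min_votes is None:
        min_votes = n_models // 2 + 1
    return [1 if sum(votes) >= min_votes else 0
            for votes in zip(*model_predictions)]


if __name__ == "__main__":
    # Hypothetical presence (1) / absence (0) values for eight lakes.
    observed = [1, 1, 0, 0, 1, 0, 0, 1]
    predictions = {
        "LDA": [1, 0, 0, 0, 1, 0, 1, 1],
        "logistic": [1, 1, 0, 0, 0, 0, 1, 1],
        "random forest": [1, 1, 0, 1, 1, 0, 0, 1],
        "ANN": [1, 1, 0, 0, 1, 0, 0, 0],
    }
    for name, pred in predictions.items():
        print(name, round(phi_coefficient(observed, pred), 3))
    combined = ensemble_vote(list(predictions.values()))
    print("ensemble", round(phi_coefficient(observed, combined), 3))
```

A vote threshold is only one way to build an ensemble; averaging predicted probabilities across models would be an equally plausible choice, and in practice each model's predictions would come from cross-validation rather than from a fixed list.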
Keywords: Artificial neural networks; Bythotrephes longimanus; Ensemble prediction; Invasive species; Linear discriminant analysis; Multiple logistic regression; Phi correlation coefficient; Principal component analysis; Random forests
We would like to thank NSERC, CAISN and the various research funding partners for facilitating this research. We thank Norman Yan for leading the field sampling program, for providing the data, and for comments on earlier presentations of this work. We also thank the Ontario Ministry of Natural Resources for providing fish composition data.