‘Batteries’ in Machine Learning: A First Experimental Assessment of Inference for Siberian Crane Breeding Grounds in the Russian High Arctic Based on ‘Shaving’ 74 Predictors

  • Falk Huettmann
  • Chunrong Mi
  • Yumin Guo


The Siberian crane (Leucogeranus leucogeranus) remains an elusive but highly regarded species of global conservation concern. It breeds in the Russian high Arctic, where two subpopulations are known. Here we present, for the first time, a machine learning-based summer habitat analysis for the eastern population, using nesting data from the breeding grounds and predictive modeling with 74 GIS predictors. Modelers typically favor parsimony because it helps make models easier to interpret, but findings generally show that parsimony does not yield the greatest improvement to the model or to inference. ‘Batteries’ are a new concept in machine learning: sets of experiments that help to test predictors and guide model selection. Here we show 28 of those ‘batteries’ and compare multiple approaches to model runs, ranging from iteratively dropping the least or most important predictor (‘variable shaving’) to allowing all predictors to contribute. We found that the generic ‘kitchen sink’ model with TreeNet (an optimized boosting algorithm from Salford Systems Ltd) performs best. While ‘batteries’ remain widely underused in wildlife conservation management, ‘shaving’ was of great use for learning about the structure, role and impacts of predictors and their spatial performance, which supports non-parsimonious work. Of great interest is the finding that a bundle of low-ranked predictors performs almost as well as, or better than, the so-called top predictors; we call this ‘predictor swapping’. This is the best and most detailed summer habitat study and prediction for the Siberian crane thus far. It is intended for use in conservation management and as a generic template for any species, as data availability and the environmental crisis are both on the rise, specifically in the high Arctic.
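The ‘variable shaving’ loop described above can be sketched in a few lines of Python. This is a minimal illustration only, not the TreeNet battery used in the study: the function name shave, the fit callback, the stand-in toy_fit and its predictor names and weights are all hypothetical. In practice the callback would wrap a boosting model and return its performance score together with its variable importances.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A fit function takes the active predictor names and returns
# (performance score, {predictor name: importance}).
FitFn = Callable[[Sequence[str]], Tuple[float, Dict[str, float]]]


def shave(predictors: Sequence[str], fit: FitFn, drop: str = "least",
          min_predictors: int = 1) -> List[Tuple[List[str], float]]:
    """Refit repeatedly, removing one predictor per step.

    drop="least" removes the lowest-ranked predictor each round,
    drop="most" removes the highest-ranked one.
    Returns one (predictor list, score) pair per step, full set first.
    """
    if drop not in ("least", "most"):
        raise ValueError("drop must be 'least' or 'most'")
    active = list(predictors)
    history: List[Tuple[List[str], float]] = []
    while True:
        score, importance = fit(active)
        history.append((list(active), score))
        if len(active) <= min_predictors:
            break
        # Rank by the importances of the model that was just fitted.
        pick = min if drop == "least" else max
        active.remove(pick(active, key=lambda name: importance[name]))
    return history


def toy_fit(active: Sequence[str]) -> Tuple[float, Dict[str, float]]:
    """Hypothetical stand-in for a real model fit (assumed values only)."""
    signal = {"elevation": 30, "july_temp": 25, "dist_to_lake": 20,
              "slope": 5, "aspect": 2}
    importance = {name: signal[name] for name in active}
    return sum(importance.values()), importance


names = ["elevation", "july_temp", "dist_to_lake", "slope", "aspect"]
for kept, score in shave(names, toy_fit, drop="least"):
    print(len(kept), "predictors, toy score", score)
```

Running the same loop with drop="most" gives the opposite shaving direction. Comparing the two score curves is one way to see how much a model depends on its top-ranked predictors versus the remaining bundle.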


Eastern population; Siberian crane (Leucogeranus leucogeranus); Nesting areas; Russian high arctic; Machine learning; Batteries (‘Shaving’); Predictor swapping



We thank Dan Steinberg and Salford Systems Ltd. for a workshop with U.S. IALE at Snowbird, Utah, that introduced us to the power of batteries. FH acknowledges the kind and long-standing collaboration with the Forestry University of Beijing, China, and the use of their data. We also thank U.S. IALE, S. Linke, C. Cambu, H. Hera, H. Berrios Alvarez and the EWHALE lab at UAF for their support. This is EWHALE lab publication #185.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. EWHALE Lab, Biology and Wildlife Department, Institute of Arctic Biology, University of Alaska-Fairbanks, Fairbanks, USA
  2. Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  3. College of Nature Conservation, Beijing Forestry University, Beijing, China
