Focused identification of germplasm strategy (FIGS) detects wheat stem rust resistance linked to environmental variables

  • Research Article
  • Genetic Resources and Crop Evolution

Abstract

Recent studies have shown that novel genetic variation for resistance to pests and diseases can be detected in plant genetic resources originating from locations whose environmental profile is similar to that of the collection sites of a reference set of accessions with known resistance. This is the basis of the Focused Identification of Germplasm Strategy (FIGS). FIGS combines two steps: developing a priori information by quantifying the trait-environment relationship, and using that information to define a best-bet subset of accessions with a higher probability of containing new variation for the sought-after trait(s). The present study investigates how the a priori information is developed, using different modeling techniques, including learning-based techniques, as a follow-up to previous work in which parametric approaches were used to quantify the relationship between stem rust resistance and climate variables. The results show that predictive power, derived from the accuracy parameters and cross-validation, varies depending on whether the models are linear or non-linear. Predictions based on learning techniques are relatively more accurate, indicating that the non-linear approaches, in particular support vector machines and neural networks, outperform both principal component logistic regression and generalized partial least squares. Overall, there are indications that resistance to stem rust is confined to certain environments or areas, whereas the susceptible types appear to be limited to other areas, with some degree of overlap between the two classes. The results also point to a number of issues to consider for improving the predictive performance of the models.
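The two FIGS steps described in the abstract can be illustrated with a minimal sketch. It assumes that some trained model has already assigned each accession a resistance score from its collection-site climate. The function names `auc` and `best_bet_subset`, the accession identifiers, and all numbers are hypothetical and are not taken from the study. The first function evaluates how well the scores separate resistant from susceptible accessions (area under the ROC curve, computed in its pairwise form), and the second picks the top-ranked accessions as a best-bet subset.

```python
from typing import Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve in its pairwise (Mann-Whitney) form.

    Returns the fraction of (resistant, susceptible) pairs in which the
    resistant accession receives the higher score; ties count one half.
    Labels: 1 = resistant, 0 = susceptible.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one accession of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_bet_subset(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k accession IDs with the highest predicted scores."""
    ranked = sorted(scores, key=lambda acc: scores[acc], reverse=True)
    return ranked[:k]


# Hypothetical reference set: known resistance classes and model scores.
reference_labels = [1, 1, 0, 0]
reference_scores = [0.9, 0.4, 0.6, 0.1]
print(auc(reference_labels, reference_scores))

# Hypothetical unscreened accessions scored by the same model.
candidates = {"acc_a": 0.2, "acc_b": 0.9, "acc_c": 0.5}
print(best_bet_subset(candidates, k=2))
```

An AUC of 0.5 corresponds to scores that rank accessions no better than chance, and 1.0 to perfect separation of the two classes, so the value gives a threshold-free way to compare models before a subset is drawn.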

Abbreviations

AUC: Area under the ROC curve
GPLS: Generalized partial least squares
GIS: Geographic information systems
NN: Neural networks
PCA: Principal component analysis
PCLR: Principal component logistic regression
PLS: Partial least squares
RF: Random forest
ROC: Receiver operating characteristics
SVM: Support vector machine

Author information

Corresponding author

Correspondence to Kenneth Street.

About this article

Cite this article

Bari, A., Street, K., Mackay, M. et al. Focused identification of germplasm strategy (FIGS) detects wheat stem rust resistance linked to environmental variables. Genet Resour Crop Evol 59, 1465–1481 (2012). https://doi.org/10.1007/s10722-011-9775-5

  • DOI: https://doi.org/10.1007/s10722-011-9775-5
