Biodiversity and Conservation, Volume 27, Issue 6, pp 1471–1486

Measuring the representativeness of a germplasm collection

  • Carlos Hernandez-Suarez
Original Paper
Part of the following topical collections:
  1. Ex-situ conservation


Many germplasm collections aim to preserve most of the genetic diversity present in a population so that the population could be regenerated, thereby providing genetic resources to ensure food security. This paper proposes a way to measure how well a germplasm collection achieves this goal. In the most common scenario, little is known about the number and statistical distribution of alleles at each locus, which makes it very difficult to measure the representativeness of an accession. Here, we show how to use samples of allelic diversity at a sample of loci to estimate the representativeness of an accession based on the coverage of a sample, with point and interval estimates. Our approach avoids making unrealistic assumptions regarding the number of loci, the bounds for the number of alleles, or their frequency distributions. Depending on the sampling scheme of a collection, we differentiate between absolute and relative coverage. We demonstrate this methodology using data from the germplasm collection at the Leibniz Institute of Plant Genetics and Crop Plant Research.
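As a rough illustration of what a point estimate of sample coverage looks like, the sketch below implements the classic nonparametric estimator attributed to Good (1953), which estimates coverage as one minus the proportion of observations that belong to classes seen exactly once. This is not necessarily the estimator developed in the paper, and it does not cover interval estimates or the absolute/relative distinction; the function name `good_coverage` and the allele labels are hypothetical and chosen only for this example.

```python
from collections import Counter


def good_coverage(alleles):
    """Good's (1953) coverage estimate: 1 - f1 / n.

    f1 is the number of alleles observed exactly once in the sample and
    n is the total number of observations. Illustrative sketch only; the
    paper's own estimator may differ.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    counts = Counter(alleles)
    # Count the allele classes that appear exactly once (singletons).
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / n


# Hypothetical allele calls at a single locus (made-up labels).
sample = ["a1", "a1", "a2", "a3", "a1", "a2", "a4", "a1"]
coverage = good_coverage(sample)
```

In this made-up sample, two of the eight observations are singletons, so the estimated coverage is 0.75: roughly a quarter of the allele frequency mass at this locus is estimated to lie in alleles the sample has not captured.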


Coverage, Allele conservation, Seed accession



The author would like to thank Dr. Marion Röder, who kindly shared the data set used in the Huang et al. (2002) paper.

Author contributions

Carlos Hernandez-Suarez developed the methodology, performed the simulations, and wrote the manuscript.

Compliance with ethical standards

Conflict of interest

The author declares no conflict of interest.


  1. Brown AHD (1995) The core collection at the crossroads. In: Hodgkin T, Brown AHD, van Hintum TJL, Morales EAV (eds) Core collections of plant genetic resources. Wiley, Chichester, pp 3–19
  2. Chao A (1981) On estimating the probability of discovering a new species. Ann Stat 9(6):1339–1342
  3. Chao A, Lee SM (1992) Estimating the number of classes via sample coverage. J Am Stat Assoc 87(417):210–217
  4. Chao A, Lee SM (1993) Estimating population size for continuous-time capture-recapture models via sample coverage. Biom J 35(1):29–45
  5. Darwin C (1866) On the origin of species by means of natural selection: or the preservation of favoured races in the struggle for life. John Murray, London
  6. Esty WW (1982) Confidence intervals for the coverage of low coverage samples. Ann Stat 10(1):190–196
  7. Esty WW (1983) A normal limit law for a nonparametric estimator of the coverage of a random sample. Ann Stat 11(3):905–912
  8. Esty W (1985) Estimation of the number of classes in a population and the coverage of a sample. Math Sci 10:41–50
  9. Esty WW (1986) The efficiency of Good’s nonparametric coverage estimator. Ann Stat 14(3):1257–1260
  10. Good IJ (1953) The population frequencies of species and the estimation of population parameters. Biometrika 40(3–4):237–264
  11. Good I, Toulmin G (1956) The number of new species, and the increase in population coverage, when a sample is increased. Biometrika 43(1–2):45–63
  12. Harris B (1959) Determining bounds on integrals with applications to cataloging problems. Ann Math Stat 30(2):521–548
  13. Huang SP, Weir B (2001) Estimating the total number of alleles using a sample coverage method. Genetics 159(3):1365–1373
  14. Huang X, Börner A, Röder M, Ganal M (2002) Assessing genetic diversity of wheat (Triticum aestivum L.) germplasm using microsatellite markers. Theor Appl Genet 105(5):699–707
  15. Knott M (1967) Models for cataloguing problems. Ann Math Stat 38(4):1255–1260
  16. Lee SM, Chao A (1994) Estimating population size via sample coverage for closed capture-recapture models. Biometrics 50(1):88–97
  17. Lo SH (1992) From the species problem to a general coverage problem via a new interpretation. Ann Stat 20(2):1094–1109
  18. Nei M (1973) Analysis of gene diversity in subdivided populations. Proc Natl Acad Sci 70(12):3321–3323
  19. Robbins HE (1968) Estimating the total probability of the unobserved outcomes of an experiment. Ann Math Stat 39(1):256–257
  20. Starr N (1979) Linear estimation of the probability of discovering a new species. Ann Stat 7(3):644–652
  21. van Hintum TJ, Brown AHD, Spillane C, Hodgkin T (2000) Core collections of plant genetic resources (IPGRI Technical Bulletin No. 3, Rome, Italy)
  22. Zhang C-H, Zhang Z (2009) Asymptotic normality of a nonparametric estimator of sample coverage. Ann Stat 37:2582–2595

Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2018

Authors and Affiliations

  1. Facultad de Ciencias, Universidad de Colima, Colima, México
