Community Ecology

Volume 19, Issue 3, pp 311–318

Bias in estimates of the classic and incidence-based Jaccard similarity indices: insights from assemblage simulation



Similarity indices are often used to measure β-diversity and as the starting point of multivariate analysis. In this study, I used simulation to examine the direction and amount of bias in estimates of two similarity indices, the classic Jaccard Coefficient (J) and the incidence-based Jaccard Coefficient (incidence-based J). I designed a novel simulation to generate three sets of assemblages that vary in species richness, species-occurrence distributions, and β-diversity. I characterized assemblage differences with the ratio of the proportion of rare species among all shared species to the proportion of rare species among all unshared species (PRss/PRus) and with Pearson's correlation in the occurrence probabilities of shared species between two assemblages (shared-species correlation). I found that the classic J was subject to strong positive or negative bias, depending on PRss/PRus. The incidence-based J was mainly subject to negative bias, which varied with shared-species correlation. In both indices, bias varied substantially from one pair of assemblages to another and among datasets. This high variation in bias across comparisons of assemblages may compromise β-diversity estimates made at low sampling effort with the two indices or their variants.
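How the bias of the classic Jaccard Coefficient can change sign with the rarity of shared versus unshared species may be easier to see in a toy simulation. The sketch below is not the simulation used in the study. It assumes each species is detected independently in each sample unit with a fixed occurrence probability, and the function names, species labels, probabilities, and number of sample units are all hypothetical.

```python
import random

def jaccard(a, b):
    """Classic Jaccard coefficient: shared species / total species."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def observed_species(occurrence_prob, n_units, rng):
    """Species detected in at least one of n_units sample units.

    Assumption: detections are independent Bernoulli trials with the
    per-unit occurrence probability given for each species.
    """
    return {
        sp for sp, p in occurrence_prob.items()
        if any(rng.random() < p for _ in range(n_units))
    }

def mean_bias(probs_1, probs_2, n_units, n_reps=500, seed=1):
    """Mean (estimated J - true J) over repeated under-sampled surveys."""
    rng = random.Random(seed)
    true_j = jaccard(set(probs_1), set(probs_2))
    estimates = [
        jaccard(observed_species(probs_1, n_units, rng),
                observed_species(probs_2, n_units, rng))
        for _ in range(n_reps)
    ]
    return sum(estimates) / n_reps - true_j

def make_pair(p_shared, p_unshared, n_shared=5, n_unshared=5):
    """Two hypothetical assemblages with n_shared species in common."""
    shared = {f"s{i}": p_shared for i in range(n_shared)}
    a = {**shared, **{f"a{i}": p_unshared for i in range(n_unshared)}}
    b = {**shared, **{f"b{i}": p_unshared for i in range(n_unshared)}}
    return a, b

# Rare species mostly unshared: they are missed, so J is overestimated.
bias_rare_unshared = mean_bias(*make_pair(0.8, 0.05), n_units=3)
# Rare species mostly shared: shared species are missed, so J is underestimated.
bias_rare_shared = mean_bias(*make_pair(0.05, 0.8), n_units=3)
print(bias_rare_unshared, bias_rare_shared)
```

In this toy setting, the sign of the bias follows from which group of species tends to go undetected at low sampling effort, in line with the dependence on PRss/PRus reported above.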


Keywords: Assemblage simulation; Beta-diversity; Estimating assemblage similarity; Under-sampling



J: the classic Jaccard Coefficient


the incidence-based Jaccard Coefficient adjusted for unseen species


the Number of Shared Species by two assemblages


occurrence probability of Species j at a random sample unit in Assemblage i,


PRss: the Proportion of Rare species out of all Shared Species by two assemblages


PRus: the Proportion of Rare species out of all Unshared Species by two assemblages


Species-Occurrence Distribution – a plot of relative occurrence frequency of species against their ranks (from common to rare)


the Total number of species in a pair of assemblages
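The two descriptors built from these quantities, PRss/PRus and the correlation in occurrence probabilities of shared species, can be sketched for a pair of assemblages as follows. This is a minimal illustration under stated assumptions, not the study's procedure: the rarity cutoff `rare_threshold`, the rule that a shared species counts as rare when its mean occurrence probability across the two assemblages falls below that cutoff, and all names and values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def assemblage_descriptors(probs_1, probs_2, rare_threshold=0.1):
    """Return (PRss, PRus, shared-species correlation) for two assemblages.

    probs_1 and probs_2 map species name to occurrence probability.
    Assumes at least one unshared species and at least two shared
    species whose probabilities vary in both assemblages.
    """
    shared = sorted(set(probs_1) & set(probs_2))
    unshared = sorted(set(probs_1) ^ set(probs_2))

    # Assumption: a shared species is "rare" if its mean occurrence
    # probability over the two assemblages is below the threshold.
    rare_shared = [sp for sp in shared
                   if (probs_1[sp] + probs_2[sp]) / 2 < rare_threshold]
    # An unshared species occurs in exactly one assemblage.
    rare_unshared = [sp for sp in unshared
                     if probs_1.get(sp, probs_2.get(sp)) < rare_threshold]

    pr_ss = len(rare_shared) / len(shared)
    pr_us = len(rare_unshared) / len(unshared)
    corr = pearson([probs_1[sp] for sp in shared],
                   [probs_2[sp] for sp in shared])
    return pr_ss, pr_us, corr

# Hypothetical occurrence probabilities for two assemblages.
probs_1 = {"s1": 0.8, "s2": 0.4, "s3": 0.05, "a1": 0.05, "a2": 0.6}
probs_2 = {"s1": 0.7, "s2": 0.5, "s3": 0.02, "b1": 0.03, "b2": 0.04}
pr_ss, pr_us, corr = assemblage_descriptors(probs_1, probs_2)
print(pr_ss / pr_us, corr)
```

A ratio below 1, as in this example, means rare species are concentrated among the unshared species.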


Supplementary material

42974_2018_19030311_MOESM1_ESM.pdf (48 kb)


Copyright information

© Akadémiai Kiadó, Budapest 2018

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  1. Illinois Natural History Survey, Prairie Research Institute, University of Illinois at Urbana-Champaign, Champaign, USA
