Similarity indices are often used to measure β-diversity and as the starting point of multivariate analyses. In this study, I used simulation to examine the direction and magnitude of bias in estimates of two similarity indices: the classic Jaccard Coefficient (J) and the incidence-based J adjusted for unseen species. I designed a novel simulation to generate three sets of assemblages that vary in species richness, species-occurrence distributions, and β-diversity. I characterized assemblage differences with two measures: the ratio of the proportion of rare species among all shared species to the proportion of rare species among all unshared species (PRss/PRus), and the Pearson's correlation in the probabilities of shared species between two assemblages (shared-species correlation). I found that the classic J was subject to strong positive or negative bias, depending on PRss/PRus. The incidence-based J was mainly subject to negative bias, which varied with the shared-species correlation. For both indices, bias varied substantially from one pair of assemblages to another and among datasets. This high variation in bias across assemblage comparisons may compromise β-diversity estimates that are based on the two indices or their variants and obtained at low sampling effort.
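The kind of bias described above can be illustrated with a minimal Python sketch. It is not the simulation used in the study: the function names, the independent per-unit detection model, and the example probabilities are all assumptions made for illustration. Each assemblage is a mapping from species to its occurrence probability in one sample unit, the true J is computed from the full species lists, and the estimated J is computed from the species detected in a limited number of sample units.

```python
import random


def jaccard(a, b):
    """Classic Jaccard coefficient: shared species / total species in the pair."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def observed_species(probs, n_units, rng):
    """Species detected in at least one of n_units sample units.

    probs maps species -> occurrence probability in a single sample unit
    (assumed independent across units; an illustrative simplification).
    """
    return {
        sp
        for sp, p in probs.items()
        if any(rng.random() < p for _ in range(n_units))
    }


def mean_bias(probs1, probs2, n_units, n_reps=500, seed=1):
    """Mean of (estimated J - true J) over repeated simulated surveys."""
    rng = random.Random(seed)
    true_j = jaccard(set(probs1), set(probs2))
    estimates = [
        jaccard(
            observed_species(probs1, n_units, rng),
            observed_species(probs2, n_units, rng),
        )
        for _ in range(n_reps)
    ]
    return sum(estimates) / n_reps - true_j


# Hypothetical pair: shared species are rare, unshared species are common.
assemblage_1 = {"s1": 0.05, "s2": 0.05, "u1": 0.9}
assemblage_2 = {"s1": 0.05, "s2": 0.05, "u2": 0.9}
print(mean_bias(assemblage_1, assemblage_2, n_units=2))
```

In this hypothetical pair the shared species are rarely detected while the unshared ones almost always are, so the estimated J falls below the true value of 0.5. Reversing the pattern (common shared species, rare unshared species) pushes the estimate upward instead.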
the classic Jaccard Coefficient
the incidence-based Jaccard Coefficient adjusted for unseen species
the Number of Shared Species by two assemblages
the occurrence probability of Species j in a random sample unit of Assemblage i
PRss: the Proportion of Rare species out of all Shared Species by two assemblages
PRus: the Proportion of Rare species out of all Unshared Species by two assemblages
Species-Occurrence Distribution – a plot of relative occurrence frequency of species against their ranks (from common to rare)
the Total number of species in a pair of assemblages
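The two descriptors of assemblage differences named in the abstract, PRss/PRus and the shared-species correlation, can be sketched from per-species occurrence probabilities as follows. This is only an illustration under stated assumptions: the rarity cutoff `RARE_THRESHOLD` is hypothetical (the text here does not give one), a shared species is counted once per assemblage when tallying rare species, and the function names are invented for this example.

```python
from math import sqrt

RARE_THRESHOLD = 0.1  # hypothetical cutoff for "rare"; not taken from the source


def proportion_rare(probs, threshold=RARE_THRESHOLD):
    """Fraction of occurrence probabilities that fall below the rarity cutoff."""
    probs = list(probs)
    return sum(p < threshold for p in probs) / len(probs)


def pr_ratio(probs1, probs2, threshold=RARE_THRESHOLD):
    """PRss/PRus for a pair of assemblages (species -> occurrence probability).

    Assumes at least one shared and one unshared species.
    """
    shared = set(probs1) & set(probs2)
    # Assumption: a shared species contributes one probability per assemblage.
    shared_p = [d[sp] for d in (probs1, probs2) for sp in shared]
    unshared_p = [
        p for d in (probs1, probs2) for sp, p in d.items() if sp not in shared
    ]
    pr_us = proportion_rare(unshared_p, threshold)
    if pr_us == 0:
        return float("inf")  # no rare unshared species
    return proportion_rare(shared_p, threshold) / pr_us


def shared_species_correlation(probs1, probs2):
    """Pearson's correlation of occurrence probabilities across shared species.

    Assumes the shared species' probabilities are not constant in either assemblage.
    """
    shared = sorted(set(probs1) & set(probs2))
    x = [probs1[sp] for sp in shared]
    y = [probs2[sp] for sp in shared]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical pair of assemblages.
assemblage_1 = {"a": 0.05, "b": 0.5, "c": 0.9, "u1": 0.6}
assemblage_2 = {"a": 0.05, "b": 0.5, "c": 0.9, "u2": 0.02}
print(pr_ratio(assemblage_1, assemblage_2))
print(shared_species_correlation(assemblage_1, assemblage_2))
```

A ratio below 1 means rare species are relatively more frequent among the unshared species than among the shared ones; a correlation near 1 means species that are common in one assemblage tend to be common in the other.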
Cao, Y. Bias in estimates of the classic and incidence-based Jaccard similarity indices: insights from assemblage simulation. COMMUNITY ECOLOGY 19, 311–318 (2018). https://doi.org/10.1556/168.2018.19.3.12
- Assemblage simulation
- Estimating assemblage similarity