Community Ecology, Volume 14, Issue 1, pp 121–132

The classification conundrum: species fidelity as leading criterion in search of a rigorous method to classify a complex forest data set

  • M. C. Lötter
  • L. Mucina
  • E. T. F. Witkowski


We present a test involving a large number of data-analytical techniques to identify a rigorous numerical classification method optimising on statistically identified faithful species. The test follows a stepwise filtering process involving various numerical-classification tools. Five steps were involved in the testing: (1) evaluation of 322 classification tools using OptimClass 1; (2) comparison of the 20 best performing methods by standardising their performances across a range of fidelity values using OptimClass 1 and OptimClass 2, to assess the effectiveness of the agglomerative clustering and one divisive technique; (3) calculation and comparison of Uniqueness values and ISAMIC (Indicator Species Analysis Minimising Intermediate Constancies) scores of the resulting classifications; (4) comparison of different classifications by analysing the similarities of the resulting synoptic tables using faithful species, assuming that clusters with similar faithful species represent corresponding vegetation types; and (5) final selection of the single best method based on an expert review of non-geometric internal evaluators, NMDS ordinations and mapped classification solutions. A complex data set, representing many forest vegetation types and consisting of 506 relevés of 20 m × 20 m sampled in the indigenous forests of Mpumalanga Province (South Africa), was tested. Analysis of Uniqueness provided insight into which methods produced classifications that did not share faithful species. The analysis of synoptic table similarity showed that the classification results were at most 88% similar, while in the most divergent case similarity of only 50% was achieved. OptimClass eliminated poorly performing numerical-classification combinations and highlighted the best performing methods. Yet it was unable to reveal the single best performing method unequivocally across the range of fidelity values used. In such cases, we suggest that a solution can be sought by involving external data through expert opinion. Ordinal Clustering and TWINSPAN produced the most outlying classification results. Flexible beta clustering (β = -0.25) in combination with the Bray-Curtis coefficient, standardised by sample unit totals, produced the most informative result for our data set when using informal expert-defined ecological and biogeographical judgement criteria. We recommend that the performance of a set of methods be tested prior to selecting the final classification approach.
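The combination named above (standardisation by sample unit totals, Bray-Curtis coefficient, flexible beta clustering with β = -0.25) can be illustrated with a minimal sketch. It is not the authors' implementation and not the routine used by any particular software package; the function names and the toy data are hypothetical. It assumes the standard Lance-Williams flexible update, d(k, i∪j) = α·d(k, i) + α·d(k, j) + β·d(i, j) with α = (1 − β)/2.

```python
from itertools import combinations


def standardise_by_totals(rows):
    """Divide each sample unit (row) by its total so that every row sums to 1."""
    out = []
    for row in rows:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a - b| / sum(a + b)."""
    den = sum(x + y for x, y in zip(a, b))
    return sum(abs(x - y) for x, y in zip(a, b)) / den if den else 0.0


def flexible_beta(rows, n_clusters, beta=-0.25):
    """Agglomerative clustering using the Lance-Williams flexible-beta update."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    alpha = (1.0 - beta) / 2.0  # 0.625 when beta = -0.25
    clusters = {i: [i] for i in range(len(rows))}
    dist = {
        frozenset((i, j)): bray_curtis(rows[i], rows[j])
        for i, j in combinations(range(len(rows)), 2)
    }
    next_id = len(rows)
    while len(clusters) > n_clusters:
        # Fuse the pair of clusters with the smallest current dissimilarity.
        pair = min(dist, key=dist.get)
        i, j = tuple(pair)
        d_ij = dist.pop(pair)
        members = clusters.pop(i) + clusters.pop(j)
        # Update the dissimilarity of every remaining cluster to the new one.
        for k in clusters:
            d_ki = dist.pop(frozenset((k, i)))
            d_kj = dist.pop(frozenset((k, j)))
            dist[frozenset((k, next_id))] = alpha * d_ki + alpha * d_kj + beta * d_ij
        clusters[next_id] = members
        next_id += 1
    return sorted(sorted(m) for m in clusters.values())


# Hypothetical toy data: 4 sample units (rows) x 3 species (columns).
samples = [[10, 0, 0], [8, 2, 0], [0, 1, 9], [0, 0, 5]]
groups = flexible_beta(standardise_by_totals(samples), n_clusters=2)
print(groups)
```

A negative β is space-dilating, which tends to reduce chaining compared with single linkage; the toy data above merely show the mechanics and say nothing about performance on real relevé data.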


Keywords: Cluster analysis, Fidelity, JUICE software, Resemblance, Vegetation classification



Abbreviations

  • ISAMIC: Indicator Species Analysis Minimising Intermediate Constancies
  • GIS: Geographical Information Systems
  • NMDS: Non-metric Multidimensional Scaling
  • PCoA: Principal Coordinates Analysis
  • TWINSPAN: Two-way Indicator Species Analysis
  • UPGMA: Unweighted Pair-Group Method using Arithmetic Averages



Supplementary material

42974_2013_14010121_MOESM1_ESM.pdf (442 kb)
Supplementary material, approximately 442 KB.



Copyright information

© Akadémiai Kiadó, Budapest 2013

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors and Affiliations

  • M. C. Lötter (1, 2), corresponding author
  • L. Mucina (3)
  • E. T. F. Witkowski (1)

  1. Restoration and Conservation Biology, School of Animal, Plant and Environmental Sciences, University of the Witwatersrand, Johannesburg, South Africa
  2. Mpumalanga Tourism and Parks Agency, Lydenburg, South Africa
  3. School of Plant Biology, M084, The University of Western Australia, Crawley, Australia
