We present a test involving a large number of data-analytical techniques to identify a rigorous numerical classification method optimising on statistically identified faithful species. The test follows a stepwise filtering process involving various numerical-classification tools. Five steps were involved in the testing: (1) evaluation of 322 classification tools using OptimClass 1; (2) comparison of the 20 best-performing methods by standardising their performances across a range of fidelity values using OptimClass 1 and OptimClass 2, to assess the effectiveness of the agglomerative clustering techniques and one divisive technique; (3) calculation and comparison of Uniqueness values and ISAMIC (Indicator Species Analysis Minimising Intermediate Constancies) scores of the resulting classifications; (4) comparison of different classifications by analysing the similarities of the resulting synoptic tables using faithful species, assuming that clusters with similar faithful species represent corresponding vegetation types; and (5) final selection of the single best method based on an expert review of non-geometric internal evaluators, NMDS ordinations and mapped classification solutions. The test was applied to a complex data set representing many forest vegetation types and consisting of 506 relevés of 20 m × 20 m sampled in the indigenous forests of Mpumalanga Province (South Africa). Analysis of Uniqueness provided insight into which methods produced classifications that did not share faithful species. The analysis of synoptic table similarity showed that the classification results were at most 88% similar, while in the most divergent case a similarity of only 50% was achieved. OptimClass eliminated poorly performing numerical-classification combinations and highlighted the best-performing methods, yet it was unable to reveal the single best-performing method unequivocally across the range of fidelity values used. In such cases, we suggest that the choice be resolved by bringing in external data through expert opinion. Ordinal Clustering and TWINSPAN produced the most outlying classification results. Flexible beta clustering (β = -0.25) in combination with the Bray-Curtis coefficient, standardised by sample unit totals, produced the most informative result for our data set when using informal expert-defined ecological and biogeographical judgement criteria. We recommend that the performance of a set of methods be tested prior to selecting the final classification approach.
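The fidelity-based counts behind OptimClass 1 and OptimClass 2 can be illustrated with a minimal sketch. It assumes that a species is faithful to a cluster when a right-tailed Fisher's exact test on its presences falls below a threshold, that OptimClass 1 sums faithful species over all clusters of a partition, and that OptimClass 2 counts the clusters holding at least k faithful species. The function names, the toy plots and the threshold of 0.05 are hypothetical and chosen only for illustration; they are not taken from the study, which evaluated a range of fidelity values.

```python
from math import comb


def fisher_right_tail(n_total, n_cluster, n_species, n_joint):
    """P(X >= n_joint) under the hypergeometric null of no association.

    n_total   : number of plots in the data set
    n_cluster : number of plots in the cluster
    n_species : number of plots containing the species
    n_joint   : number of plots in the cluster containing the species
    """
    upper = min(n_cluster, n_species)
    tail = sum(
        comb(n_species, k) * comb(n_total - n_species, n_cluster - k)
        for k in range(n_joint, upper + 1)
    )
    return tail / comb(n_total, n_cluster)


def faithful_species(plots, labels, alpha):
    """Map each cluster label to the set of species faithful to it."""
    n_total = len(plots)
    clusters = sorted(set(labels))
    species = sorted(set().union(*plots))
    result = {c: set() for c in clusters}
    for sp in species:
        n_species = sum(sp in p for p in plots)
        for c in clusters:
            members = [p for p, lab in zip(plots, labels) if lab == c]
            n_joint = sum(sp in p for p in members)
            p_value = fisher_right_tail(n_total, len(members), n_species, n_joint)
            if p_value < alpha:
                result[c].add(sp)
    return result


def optimclass1(faithful):
    """Total number of faithful species summed over all clusters."""
    return sum(len(s) for s in faithful.values())


def optimclass2(faithful, k):
    """Number of clusters with at least k faithful species."""
    return sum(1 for s in faithful.values() if len(s) >= k)


# Hypothetical toy data: eight plots (sets of species) in two clusters.
plots = [
    {"a", "c", "d", "e"}, {"a", "c", "d", "e"},
    {"a", "c", "d", "e"}, {"a", "c", "e"},
    {"b", "c", "d"}, {"b", "c"}, {"b", "c"}, {"b", "c"},
]
labels = ["A"] * 4 + ["B"] * 4

# alpha = 0.05 is an illustrative threshold only.
faithful = faithful_species(plots, labels, alpha=0.05)
print(optimclass1(faithful))       # 3
print(optimclass2(faithful, k=2))  # 1
```

In this toy partition, species a and e are restricted to cluster A and species b to cluster B, so they count as faithful, whereas the ubiquitous species c and the weakly concentrated species d do not. Comparing such counts across candidate partitions is the kind of ranking the filtering steps rely on, although the fidelity measure and thresholds in a real analysis may differ from those assumed here.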
ISAMIC: Indicator Species Analysis Minimizing Intermediate Constancies
GIS: Geographical Information Systems
NMDS: Non-metric Multidimensional Scaling
PCoA: Principal Coordinates Analysis
TWINSPAN: Two-way Indicator Species Analysis
UPGMA: Unweighted Pair-Group Method using Arithmetic Averages
Lötter, M.C., Mucina, L. & Witkowski, E.T.F. The classification conundrum: species fidelity as leading criterion in search of a rigorous method to classify a complex forest data set. Community Ecology 14, 121–132 (2013). https://doi.org/10.1556/ComEc.14.2013.1.13
- Cluster analysis
- JUICE software
- Vegetation classification