Explanatory Variables in Classifications and the Detection of the Optimum Number of Clusters

  • János Podani
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


An ordinal approach to the a posteriori evaluation of the explanatory power of variables in classifications is proposed. The contribution of each variable is assessed in a way fully compatible with the distance or dissimilarity function used in the clustering process. Then, a simple ranking-based measure is applied to express the relative agreement or disagreement of variables with a given partition. This measure treats all variables equally, no matter how influential they were when the classification was actually created. The sum of measures for all variables reflects their overall agreement and can be used to select an optimal partition from a hierarchical classification.
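The abstract does not give the formula for the ranking-based measure, so the following is only a minimal sketch of one way such an ordinal agreement score could be computed, not the chapter's own definition. It assumes squared Euclidean distance, so that a variable's contribution to the dissimilarity of two objects is the squared difference of their values, and it scores a variable by comparing within-cluster and between-cluster pairs ordinally. The function names `variable_agreement`, `overall_agreement`, and `best_partition` are hypothetical.

```python
from itertools import combinations


def variable_agreement(values, labels):
    """Ordinal agreement of one variable with a partition (illustrative only).

    values: one number per object; labels: one cluster label per object.
    Assumption: the variable's contribution to the dissimilarity of a pair
    is the squared difference, as under squared Euclidean distance.
    Returns a score in [-1, 1]; +1 means every within-cluster pair is closer
    on this variable than every between-cluster pair.
    """
    within, between = [], []
    for i, j in combinations(range(len(values)), 2):
        d = (values[i] - values[j]) ** 2
        (within if labels[i] == labels[j] else between).append(d)
    if not within or not between:
        return 0.0  # score is undefined for trivial partitions; use 0 here
    concordant = sum(1 for w in within for b in between if w < b)
    discordant = sum(1 for w in within for b in between if w > b)
    return (concordant - discordant) / (len(within) * len(between))


def overall_agreement(data, labels):
    """Sum of per-variable scores; data is a list of rows (one per object)."""
    return sum(variable_agreement(column, labels) for column in zip(*data))


def best_partition(data, candidate_labelings):
    """Pick the candidate partition with the highest summed agreement."""
    return max(candidate_labelings, key=lambda labels: overall_agreement(data, labels))


# Four objects, two variables: the first variable separates the groups,
# the second one works against them.
data = [[0, 5], [1, 1], [10, 4], [11, 2]]
labels = [0, 0, 1, 1]
print(variable_agreement([row[0] for row in data], labels))
print(variable_agreement([row[1] for row in data], labels))
print(overall_agreement(data, labels))
```

Each variable is scored on its own ranks here, so every variable counts equally regardless of its scale or of how strongly it influenced the clustering. In practice the candidate labelings passed to `best_partition` would be the partitions obtained by cutting a hierarchical classification at successive levels.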


Keywords: Explanatory Power, Rank Order, Hierarchical Classification, Vegetational Plot, Relative Agreement





Copyright information

© Springer Japan 1998

Authors and Affiliations

  • János Podani, Department of Plant Taxonomy and Ecology, Loránd Eötvös University, Budapest, Hungary
