
Non-probabilistic Classification

  • John C. Gower
  • Gavin J. S. Ross
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Much of the classification literature ignores notions of probability. In our view, this is due in part to a dominant tendency, in the early days of computers, to develop heuristic clustering algorithms and in part to long traditions in classification outside the statistical/probabilistic orbit, of which biological taxonomy and book classification are primary examples. Statisticians have rightly stressed the role of probabilistic concepts in formulating classification problems and in interpreting classifications, but we believe that they are wrong in suggesting, as they sometimes seem to, that other approaches are unsatisfactory. Probability has its proper place in classification, but it is neither an essential nor always an appropriate tool. We discuss circumstances where non-probabilistically based classifications are fully justified.

Considerations influencing the differences between the two approaches include: 1) Irrespective of whether things are to be assembled into classes (arranged hierarchically or not) or assigned to previously recognised classes, methodology depends on whether the things may be regarded as representing groups or as samples from groups; 2) Models are basic to the formulation of statistically based classifications, but they may also underpin non-probabilistic classifications; overt models are not a characteristic of heuristic classification algorithms; 3) In principle, probabilistic models allow the significance and number of clusters justified by data to be assessed. In non-probabilistic classifications (and in probabilistic ones too), the eighteenth-century concept of approximation offers a good basis for assessing the adequacy and stability of clusters.

Keywords

Probabilistic Classification; Non-probabilistic Classification; Classes; Groups; Assignment; Class Construction

References

  1. Gilmour, J. S. L. (1937). A taxonomic problem. Nature, 134, 1040–1042.
  2. Gower, J. C. (1971). A general coefficient of similarity and some of its properties. Biometrics, 27, 857–871.
  3. Gower, J. C. (1975). Maximal predictive classification. Biometrics, 30, 643–654.
  4. Gower, J. C. (1998). Classification. In: Encyclopaedia of Biostatistics, Armitage, P. and Colton, T. (Eds.), Wiley, Chichester (in press).
  5. Payne, R. W. and Preece, D. A. (1980). Identification keys and diagnostic tables: a review (with discussion). J. R. Statist. Soc. A, 143, 253–292.

Copyright information

© Springer-Verlag Berlin · Heidelberg 1998

Authors and Affiliations

  • John C. Gower (1)
  • Gavin J. S. Ross (2)
  1. Department of Statistics, The Open University, Milton Keynes, UK
  2. Statistics Department, IACR Rothamsted, Harpenden, Herts, UK
