How Many Clusters? An Investigation of Five Procedures for Detecting Nested Cluster Structure

  • Conference paper

Summary

The paper addresses the problem of identifying relevant values for the number of clusters present in a data set. The problem has usually been tackled by searching for a best partition using so-called stopping rules. It is argued that it can be of interest to detect cluster structure at several different levels, and five stopping rules that performed well in a previous investigation are modified for this purpose. The rules are assessed by their performance in the analysis of simulated data sets which contain nested cluster structure.
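
To make the idea of detecting structure at several levels concrete, here is a minimal sketch in Python. It is not the procedure from the paper: it assumes the Calinski and Harabasz (1974) index as the stopping rule, and it assumes that "relevant" numbers of clusters can be read off as local maxima of that index over a nested sequence of partitions. The toy data, the hand-written partitions, and the helper names `calinski_harabasz` and `local_maxima` are all hypothetical.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k))."""
    n, dim = len(points), len(points[0])
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("index is defined only for 2 <= k < n")
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for members in groups.values():
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Between-cluster sum of squares, weighted by cluster size.
        between += len(members) * sum((c - g) ** 2 for c, g in zip(centroid, grand))
        # Within-cluster sum of squares around the cluster centroid.
        within += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return (between / (k - 1)) / (within / (n - k))


def local_maxima(scores):
    """Return each k whose score beats its neighbours in the examined range.

    Assumption: an endpoint is compared with its single neighbour only.
    """
    ks = sorted(scores)
    picked = []
    for i, k in enumerate(ks):
        left_ok = i == 0 or scores[k] > scores[ks[i - 1]]
        right_ok = i == len(ks) - 1 or scores[k] > scores[ks[i + 1]]
        if left_ok and right_ok:
            picked.append(k)
    return picked


# Toy 1-D data: four tight pairs that form two well-separated super-clusters.
points = [(-0.1,), (0.1,), (0.9,), (1.1,), (9.9,), (10.1,), (10.9,), (11.1,)]

# Nested partitions written by hand, as if cut from one hierarchy.
partitions = {
    2: [0, 0, 0, 0, 1, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2, 2, 2],
    4: [0, 0, 1, 1, 2, 2, 3, 3],
    5: [0, 1, 2, 2, 3, 3, 4, 4],
}

scores = {k: calinski_harabasz(points, labels) for k, labels in partitions.items()}
candidates = local_maxima(scores)
print(candidates)  # [2, 4]
```

On this toy data the index peaks at both 2 and 4 clusters, which matches the two levels built into the example. Reporting only the global maximum would return 4 and miss the coarser level.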

References

  • Beale, E. M. L. (1969): Euclidean cluster analysis. Bulletin of the International Statistical Institute, 43(2), 92–94.

  • Bock, H. H. (1996): Probability models and hypotheses testing in partitioning cluster analysis. In Clustering and Classification, Arabie, P., Hubert, L. J. and De Soete, G. (eds.), 377–453, World Scientific, River Edge, NJ.

  • Calinski, T. and Harabasz, J. (1974): A dendrite method for cluster analysis. Communications in Statistics, 3, 1–27.

  • Cooper, M. C. and Milligan, G. W. (1988): The effect of measurement error on determining the number of clusters in cluster analysis. In Data, Expert Knowledge and Decisions, Gaul, W. and Schader, M. (eds.), 319–328, Springer-Verlag, Berlin.

  • Duda, R. O. and Hart, P. E. (1973): Pattern Classification and Scene Analysis. Wiley, New York.

  • Goodman, L. A. and Kruskal, W. H. (1954): Measures of association for cross-classifications. Journal of the American Statistical Association, 49, 732–764.

  • Gordon, A. D. (1996): Cluster validation. Paper presented at IFCS-96 Conference, Kobe, 27–30 March, 1996.

  • Hubert, L. (1974): Approximate evaluation techniques for the single-link and complete-link hierarchical clustering procedures. Journal of the American Statistical Association, 69, 698–704.

  • Jain, A. K. and Dubes, R. C. (1988): Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs, NJ.

  • Milligan, G. W. and Cooper, M. C. (1985): An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50, 159–179.

  • Scott, A. J. and Symons, M. J. (1971): Clustering methods based on likelihood ratio criteria. Biometrics, 27, 387–397.

Copyright information

© 1998 Springer Japan

About this paper

Cite this paper

Gordon, A.D. (1998). How Many Clusters? An Investigation of Five Procedures for Detecting Nested Cluster Structure. In: Hayashi, C., Yajima, K., Bock, HH., Ohsumi, N., Tanaka, Y., Baba, Y. (eds) Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Tokyo. https://doi.org/10.1007/978-4-431-65950-1_9

  • DOI: https://doi.org/10.1007/978-4-431-65950-1_9

  • Publisher Name: Springer, Tokyo

  • Print ISBN: 978-4-431-70208-5

  • Online ISBN: 978-4-431-65950-1

  • eBook Packages: Springer Book Archive
