How Many Clusters? An Investigation of Five Procedures for Detecting Nested Cluster Structure

Gordon, A. D.

doi:10.1007/978-4-431-65950-1_9

Conference paper

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Summary

The paper addresses the problem of identifying relevant values for the number of clusters present in a data set. The problem has usually been tackled by searching for a best partition using so-called stopping rules. It is argued that it can be of interest to detect cluster structure at several different levels, and five stopping rules that performed well in a previous investigation are modified for this purpose. The rules are assessed by their performance in the analysis of simulated data sets which contain nested cluster structure.
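
As a rough illustration of detecting structure at more than one level, the sketch below applies a single stopping rule, the Calinski-Harabasz index, to every level of a Ward hierarchy and flags each local maximum instead of only the global one. The summary does not say which five rules are studied or how they are modified, so the choice of index, the local-maximum criterion, Ward agglomeration, the function names, and the toy data (four tight groups arranged as two well-separated pairs) are all assumptions made for this example, not the paper's procedure.

```python
from itertools import combinations


def centroid(points):
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_ss(clusters):
    # Sum of squared distances from each point to its cluster centroid.
    return sum(sq_dist(p, centroid(c)) for c in clusters for p in c)


def ward_cost(a, b):
    # Increase in within-cluster sum of squares caused by merging a and b.
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist(centroid(a), centroid(b))


def ward_hierarchy(points):
    """Return {k: partition into k clusters} from naive Ward agglomeration."""
    clusters = [[p] for p in points]
    levels = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        levels[len(clusters)] = [list(c) for c in clusters]
    return levels


def calinski_harabasz(points, clusters):
    # (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k))
    n, k = len(points), len(clusters)
    total = within_ss([points])
    w = within_ss(clusters)
    return ((total - w) / (k - 1)) / (w / (n - k))


def candidate_levels(points, k_max):
    """Flag every k whose index exceeds that of its evaluated neighbours."""
    levels = ward_hierarchy(points)
    ch = {k: calinski_harabasz(points, levels[k]) for k in range(2, k_max + 1)}
    flagged = [k for k in ch
               if all(ch[k] > ch[m] for m in (k - 1, k + 1) if m in ch)]
    return ch, flagged


# Hypothetical nested data: four tight groups forming two distant pairs.
offsets = [(0.0, 0.0), (0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2)]
centres = [(0.0, 0.0), (3.0, 0.0), (20.0, 0.0), (23.0, 0.0)]
data = [(cx + dx, cy + dy) for cx, cy in centres for dx, dy in offsets]

ch, flagged = candidate_levels(data, k_max=8)
print(flagged)  # prints [2, 4]
```

On this toy data a search for a single best partition would report only four clusters, since the index is largest there; the local-maximum variant also reports the coarser two-cluster level.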

Author information

Authors and Affiliations

Mathematical Institute, University of St Andrews, North Haugh, St Andrews, KY16 9SS, Scotland
A. D. Gordon

Editor information

Editors and Affiliations

The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan
Chikio Hayashi, Noboru Ohsumi & Yasumasa Baba
School of Management, Science University of Tokyo, 500 Shimokiyoku, Kuki, Saitama 346, Japan
Keiji Yajima
Institut für Statistik, Rheinisch-Westfälische Technische Hochschule (RWTH), D-52056, Aachen, Germany
Hans-Hermann Bock
Faculty of Environmental Science & Technology, Okayama University, 2-1-1 Tsushima-naka, Okayama 700, Japan
Yutaka Tanaka

About this paper

Cite this paper

Gordon, A.D. (1998). How Many Clusters? An Investigation of Five Procedures for Detecting Nested Cluster Structure. In: Hayashi, C., Yajima, K., Bock, HH., Ohsumi, N., Tanaka, Y., Baba, Y. (eds) Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Tokyo. https://doi.org/10.1007/978-4-431-65950-1_9

DOI: https://doi.org/10.1007/978-4-431-65950-1_9
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-70208-5
Online ISBN: 978-4-431-65950-1
eBook Packages: Springer Book Archive
