The Effects of Initial Values and the Covariance Structure on the Recovery of some Clustering Methods
Several clustering methods are compared in a simulation study. The data used in the analysis are generated in a mixture modeling framework. The methods included are some hierarchical methods, k-means as implemented in the FASTCLUS procedure of SAS, and cluster analysis by means of normal mixtures with the NORMIX program. We demonstrate that the poor recovery found in some studies for normal-mixture clustering is due partly to the use of bad initial values and partly to the specification of the covariance structure within the clusters. We further find that an important factor in the relative success of FASTCLUS lies in its initial seed selection.
Keywords: Covariance Structure, Cluster Centroid, True Cluster, Normal Mixture, Hierarchical Cluster Method
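The dependence of recovery on initial values can be illustrated with a small toy experiment. The sketch below is not the FASTCLUS or NORMIX procedure and does not reproduce the study's design: it draws data from an equal-weight spherical normal mixture and runs a plain Lloyd-style k-means from two hand-picked sets of seeds. All function names, the mixture parameters, and the seed positions are assumptions chosen for illustration, and recovery is measured with the Rand index against the true component labels.

```python
import random

def sample_mixture(n, means, sd, rng):
    """Draw n points from an equal-weight spherical normal mixture."""
    points, labels = [], []
    for _ in range(n):
        k = rng.randrange(len(means))          # pick a component at random
        points.append(tuple(rng.gauss(m, sd) for m in means[k]))
        labels.append(k)
    return points, labels

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, seeds, iters=100):
    """Plain Lloyd-style k-means started from the given seeds."""
    centers = [tuple(s) for s in seeds]
    assign = None
    for _ in range(iters):
        new_assign = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_assign == assign:               # partition stopped changing
            break
        assign = new_assign
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:                        # an empty cluster keeps its old center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign, centers

def rand_index(a, b):
    """Share of point pairs on which two partitions agree (1.0 = identical)."""
    agree = total = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            agree += (a[i] == a[j]) == (b[i] == b[j])
            total += 1
    return agree / total

# Hypothetical setup: two neighbouring components and one distant component.
rng = random.Random(0)
true_means = [(0.0, 0.0), (4.0, 0.0), (40.0, 0.0)]
points, truth = sample_mixture(300, true_means, 0.5, rng)

# Good seeds: one seed near each true component mean.
good_assign, _ = kmeans(points, true_means)

# Bad seeds: two seeds inside the distant component, one between the other two.
bad_seeds = [(40.0, -0.3), (40.0, 0.3), (2.0, 0.0)]
bad_assign, _ = kmeans(points, bad_seeds)

rand_good = rand_index(truth, good_assign)
rand_bad = rand_index(truth, bad_assign)
print("Rand index, good seeds:", round(rand_good, 3))
print("Rand index, bad seeds: ", round(rand_bad, 3))
```

With the bad seeds, the algorithm settles in a local optimum that merges the two neighbouring components and splits the distant one, so recovery drops even though the data and the method are unchanged.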