Abstract
Starting from model-based clustering, simple techniques based on cores are proposed. A core is a dense region in the high-dimensional space that can be represented, for example, by its most typical observation, by its centroid or, more generally, by assigning weight functions to the observations. Well-known cluster analysis techniques such as the partitional K-Means method or the hierarchical Ward method are useful for discovering partitions or hierarchies in the underlying data. Here these methods are generalised in two ways: first by using weighted observations, and second by allowing clusters of different volumes. Then a more general K-Means approach based on pairwise distances is recommended. Simulation studies are carried out to compare the new clustering techniques with the well-known ones. Moreover, a successful application is presented, in which the task is to discover clusters with quite different numbers of observations in a high-dimensional space.
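The link between weighted observations and pairwise distances can be illustrated with a standard identity for the sum-of-squares criterion: for a cluster with weights m_i and total weight M, the weighted sum of squared distances to the weighted centroid equals the sum over pairs i < j of m_i m_j d_ij^2, divided by M. The sketch below is only an illustration under that assumption (squared Euclidean distances); it is not taken from the chapter, and the function names are hypothetical.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def weighted_centroid(points, weights):
    """Centroid of the points, each weighted by its observation weight."""
    total = sum(weights)
    dim = len(points[0])
    return [
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    ]


def within_ss_centroid(points, weights):
    """Weighted within-cluster sum of squares, computed via the centroid."""
    c = weighted_centroid(points, weights)
    return sum(w * sq_dist(p, c) for p, w in zip(points, weights))


def within_ss_pairwise(points, weights):
    """Same quantity, computed from pairwise squared distances only."""
    total = sum(weights)
    pair_sum = sum(
        weights[i] * weights[j] * sq_dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    )
    return pair_sum / total


def partition_criterion(points, weights, labels):
    """Sum of the pairwise-distance criterion over all clusters of a partition."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    return sum(
        within_ss_pairwise([points[i] for i in members],
                           [weights[i] for i in members])
        for members in clusters.values()
    )
```

Because the pairwise form never needs an explicit centroid, the criterion can be evaluated from a distance matrix alone, which is one reason a pairwise formulation is convenient when only dissimilarities are available.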
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mucha, H.-J., Bartel, H.-G., Dolata, J. (2003). Core-Based Clustering Techniques. In: Schader, M., Gaul, W., Vichi, M. (eds) Between Data Science and Applied Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18991-3_9
DOI: https://doi.org/10.1007/978-3-642-18991-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40354-8
Online ISBN: 978-3-642-18991-3
eBook Packages: Springer Book Archive