Core-Based Clustering Techniques

Mucha, Hans-Joachim; Bartel, Hans-Georg; Dolata, Jens

doi:10.1007/978-3-642-18991-3_9

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

Starting from model-based clustering, simple techniques based on cores are proposed. A core is a dense region in the high-dimensional space that can be represented, for example, by its most typical observation, by its centroid or, more generally, by assigning weight functions to the observations. Well-known cluster analysis techniques such as the partitional K-Means or the hierarchical Ward method are useful for discovering partitions or hierarchies in the underlying data. Here these methods are generalised in two ways: firstly by using weighted observations, and secondly by allowing different volumes of clusters. Then a more general K-Means approach based on pair-wise distances is recommended. Simulation studies are carried out in order to compare the new clustering techniques with the well-known ones. Moreover, a successful application is presented, in which the task is to discover clusters with quite different numbers of observations in a high-dimensional space.
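
The abstract does not spell out the chapter's exact formulation, so the following is only a minimal sketch of the general idea behind a K-Means criterion with weighted observations that works from pair-wise distances. It relies on a standard identity, assuming squared Euclidean distances: for a cluster with observation weights m_i and total weight M, the weighted sum of squared deviations from the weighted centroid equals the sum of m_i m_l d_il over all pairs, divided by 2M. The function names and the random test data are hypothetical and are not taken from the chapter.

```python
import numpy as np

def weighted_wss_from_centroids(X, labels, weights):
    """Weighted within-cluster sum of squares, computed via weighted centroids."""
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        m = weights[idx]
        centroid = np.average(X[idx], axis=0, weights=m)
        total += np.sum(m * np.sum((X[idx] - centroid) ** 2, axis=1))
    return total

def weighted_wss_from_distances(D, labels, weights):
    """Same criterion using only pair-wise squared Euclidean distances D."""
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        m = weights[idx]
        Dk = D[np.ix_(idx, idx)]              # distances within cluster k
        total += (m @ Dk @ m) / (2.0 * m.sum())
    return total

# Hypothetical data: 30 observations in 5 dimensions, an arbitrary 3-cluster
# partition, and arbitrary positive observation weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
labels = rng.integers(0, 3, size=30)
weights = rng.uniform(0.5, 2.0, size=30)

# Matrix of pair-wise squared Euclidean distances.
diff = X[:, None, :] - X[None, :, :]
D = np.sum(diff ** 2, axis=2)

via_centroids = weighted_wss_from_centroids(X, labels, weights)
via_distances = weighted_wss_from_distances(D, labels, weights)
print(np.isclose(via_centroids, via_distances))
```

Because the second function never touches the coordinates, a criterion of this kind can be evaluated when only a distance matrix is available; setting all weights to 1 recovers the usual unweighted K-Means sum of squares.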

Author information

Authors and Affiliations

Weierstraß-Institut für Angewandte Analysis und Stochastik, D-10117, Berlin, Germany
Hans-Joachim Mucha
Institut für Chemie, Humboldt-Universität zu Berlin, D-12489, Berlin, Germany
Hans-Georg Bartel
Landesamt für Denkmalpflege Rheinland-Pfalz, D-55116, Mainz, Germany
Jens Dolata

Editor information

Editors and Affiliations

Department of Information Systems, University of Mannheim, Schloss, 68131, Mannheim, Germany
Martin Schader
Institute of Decision Theory, University of Karlsruhe, Kaiserstr. 12, 76128, Karlsruhe, Germany
Wolfgang Gaul
Department of Statistics, University of Rome, Piazzale Aldo Moro, 00185, Rome, Italy
Maurizio Vichi

About this paper

Cite this paper

Mucha, HJ., Bartel, HG., Dolata, J. (2003). Core-Based Clustering Techniques. In: Schader, M., Gaul, W., Vichi, M. (eds) Between Data Science and Applied Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18991-3_9

DOI: https://doi.org/10.1007/978-3-642-18991-3_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40354-8
Online ISBN: 978-3-642-18991-3
eBook Packages: Springer Book Archive
