Abstract
In this paper we propose an approach to clustering large datasets of mixed units. It is based on representing each cluster by the distributions of the values of its variables over the cluster (histograms), a representation that is compatible with the merging of clusters. The proposed representation can also be used for clustering symbolic data. On the basis of this representation, adapted versions of the leaders method and the adding method were implemented. The proposed approach was successfully applied to several large datasets.
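To illustrate what "compatible with merging" can mean for a histogram representation, here is a minimal sketch and not the authors' implementation. It assumes that every variable has already been reduced to a finite set of categories (numeric variables would need to be binned first), and that a cluster is described by one count histogram per variable. The names `unit_to_histograms`, `merge`, and `relative` are hypothetical and chosen only for this example.

```python
from collections import Counter

def unit_to_histograms(unit):
    """Describe a single unit as one count histogram per variable (assumed encoding)."""
    return {var: Counter([value]) for var, value in unit.items()}

def merge(desc1, desc2):
    """Merge two cluster descriptions by adding counts, variable by variable."""
    merged = {}
    for var in desc1.keys() | desc2.keys():
        merged[var] = desc1.get(var, Counter()) + desc2.get(var, Counter())
    return merged

def relative(hist):
    """Turn a count histogram into a relative-frequency distribution."""
    total = sum(hist.values())
    return {value: count / total for value, count in hist.items()}

# Hypothetical toy data: three units with two categorical variables.
units = [
    {"colour": "red", "size": "small"},
    {"colour": "red", "size": "large"},
    {"colour": "blue", "size": "small"},
]

cluster_a = merge(unit_to_histograms(units[0]), unit_to_histograms(units[1]))
cluster_b = unit_to_histograms(units[2])
cluster_ab = merge(cluster_a, cluster_b)
```

Because the description of a merged cluster is obtained from the two descriptions alone, without revisiting the original units, a representation of this kind can be updated incrementally on a large dataset. Storing counts rather than relative frequencies is one possible design choice that keeps the merge a plain addition.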
© 1998 Springer-Verlag Berlin · Heidelberg
About this paper
Cite this paper
Korenjak-Černe, S., Batagelj, V. (1998). Clustering Large Datasets of Mixed Units. In: Rizzi, A., Vichi, M., Bock, HH. (eds) Advances in Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-72253-0_6
DOI: https://doi.org/10.1007/978-3-642-72253-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64641-9
Online ISBN: 978-3-642-72253-0
eBook Packages: Springer Book Archive