Clustering Large Datasets of Mixed Units

  • Simona Korenjak-Černe
  • Vladimir Batagelj
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

In this paper we propose an approach to clustering large datasets of mixed units. Each cluster is represented by the distributions of the values of its variables over the cluster (histograms), a representation that is compatible with merging of clusters. The same representation can also be used for clustering symbolic data. Based on this representation, adapted versions of the leaders method and the adding method were implemented. The proposed approach was successfully applied to several large datasets.
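To illustrate why a histogram description is compatible with merging, here is a minimal Python sketch. It is not the authors' implementation: the function names `describe`, `merge`, and `relative`, the toy data, and the restriction to categorical variables are all assumptions made for illustration (a numerical variable would first have to be binned into intervals). The point it shows is that the description of the union of two disjoint clusters can be computed from the two descriptions alone, without revisiting the units.

```python
from collections import Counter

def describe(units, variables):
    """Per-variable value counts for a cluster given as a list of dicts."""
    return {v: Counter(u[v] for u in units) for v in variables}

def merge(desc_a, desc_b):
    """Description of the union of two disjoint clusters: add the counts."""
    return {v: desc_a[v] + desc_b[v] for v in desc_a}

def relative(desc):
    """Turn counts into relative frequencies (one histogram per variable)."""
    out = {}
    for v, counts in desc.items():
        total = sum(counts.values())
        out[v] = {value: n / total for value, n in counts.items()}
    return out

# Hypothetical toy units with two categorical variables.
units = [
    {"colour": "red", "size": "small"},
    {"colour": "red", "size": "large"},
    {"colour": "blue", "size": "small"},
    {"colour": "blue", "size": "small"},
]
variables = ["colour", "size"]

left = describe(units[:2], variables)
right = describe(units[2:], variables)
merged = merge(left, right)

# Merging the two descriptions gives the same result as describing
# the union of the two clusters directly.
print(merged == describe(units, variables))
print(relative(merged)["size"])
```

Because only counts are stored, the size of a cluster description depends on the number of distinct values (or bins) per variable rather than on the number of units, which is what makes this kind of representation attractive for large datasets.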

Keywords

large datasets; clustering; mixed units; distribution description compatible with merging of clusters; leaders method; adding method

Copyright information

© Springer-Verlag Berlin · Heidelberg 1998

Authors and Affiliations

  • Simona Korenjak-Černe (1, 2)
  • Vladimir Batagelj (1, 2)

  1. Faculty of Mathematics and Physics, University of Ljubljana, Slovenia
  2. Dept. of TCS, Institute of Mathematics, Physics and Mechanics, Ljubljana, Slovenia