Clustering Large Datasets of Mixed Units

Korenjak-Černe, Simona; Batagelj, Vladimir

doi:10.1007/978-3-642-72253-0_6


Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization


Abstract

In this paper we propose an approach to clustering large datasets of mixed units. It is based on representing each cluster by the distributions of the values of its variables over the cluster (histograms), a representation that is compatible with merging of clusters. The proposed representation can also be used for clustering symbolic data. On the basis of this representation, adapted versions of the leaders method and the adding method were implemented. The proposed approach was successfully applied to several large datasets.
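
As a rough illustration of why a histogram representation is compatible with merging, the sketch below describes each cluster by per-variable category counts and merges two clusters by adding those counts. It is not the authors' implementation: the function names, the binning of the numeric variable, and the sample units are all assumptions made for this example.

```python
from collections import Counter

def describe_unit(unit, encoders):
    """Describe a single unit as per-variable category counts.

    `encoders` (hypothetical) maps each variable name to a function that
    turns a raw value into a category label, so numeric and nominal
    variables end up in the same kind of description.
    """
    return {var: Counter([encode(unit[var])]) for var, encode in encoders.items()}

def merge(desc_a, desc_b):
    """Merge two cluster descriptions by adding counts variable by variable."""
    return {var: desc_a[var] + desc_b[var] for var in desc_a}

def relative(desc):
    """Convert counts to relative frequencies (one histogram per variable)."""
    out = {}
    for var, counts in desc.items():
        total = sum(counts.values())
        out[var] = {cat: n / total for cat, n in counts.items()}
    return out

# Hypothetical mixed units: one numeric variable (binned) and one nominal.
encoders = {
    "age": lambda v: "<30" if v < 30 else "30+",  # assumed binning
    "color": lambda v: v,                         # nominal, used as is
}
units = [
    {"age": 25, "color": "red"},
    {"age": 41, "color": "red"},
    {"age": 33, "color": "blue"},
]

cluster_a = merge(describe_unit(units[0], encoders),
                  describe_unit(units[1], encoders))
cluster_b = describe_unit(units[2], encoders)

# The description of the union is obtained from the two descriptions alone,
# without revisiting the original units.
merged = merge(cluster_a, cluster_b)
histograms = relative(merged)
```

Because the merged description depends only on the two input descriptions, each unit of a large dataset needs to be read once to build its description; this is the property the abstract refers to as compatibility with merging.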



Author information

Authors and Affiliations

Faculty of Mathematics and Physics, University of Ljubljana, Slovenia
Simona Korenjak-Černe & Vladimir Batagelj
Dept. of TCS, Institute of Mathematics, Physics and Mechanics, Jadranska 19, 1000 Ljubljana, Slovenia
Simona Korenjak-Černe & Vladimir Batagelj


Editor information

Editors and Affiliations

Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università di Roma “La Sapienza”, Piazzale Aldo Moro 5, I-00185, Roma, Italia
Alfredo Rizzi
Dipartimento di Metodi Quantitativi e Teoria Economica, Università “G. D’Annunzio” di Chieti, Viale Pindaro 42, I-65127, Pescara, Italia
Maurizio Vichi
Institut für Statistik und Wirtschaftsmathematik, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Wüllnerstraße 3, D-52056, Aachen, Germany
Hans-Hermann Bock

About this paper

Cite this paper

Korenjak-Černe, S., Batagelj, V. (1998). Clustering Large Datasets of Mixed Units. In: Rizzi, A., Vichi, M., Bock, HH. (eds) Advances in Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-72253-0_6


DOI: https://doi.org/10.1007/978-3-642-72253-0_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64641-9
Online ISBN: 978-3-642-72253-0
eBook Packages: Springer Book Archive
