Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Database Clustering Methods

Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_550

Synonyms

Similarity-based data partitioning

Definitions

Given a database $D = \{t_1, t_2, \ldots, t_n\}$ of tuples and a user-defined similarity function $s$ with $0 \le s(t_i, t_j) \le 1$ for all $t_i, t_j \in D$, the database clustering problem is to partition $D$ into $k$ subsets $C_1, C_2, \ldots, C_k$ (where $k$ may be given) according to $s$, by assigning each tuple in $D$ to a subset $C_i$. Each $C_i$ is called a cluster and satisfies $C_i = \{t_i \mid s(t_i, t_r) \ge s(t_i, t_s) \text{ if } t_i, t_r \in C_i \text{ and } t_s \notin C_i\}$.
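The condition in the definition can be checked mechanically for a candidate partition. The following is a minimal sketch, not a clustering algorithm: it only verifies that every tuple is at least as similar to the members of its own cluster as to any tuple outside it. The names `similarity_1d` and `satisfies_cluster_condition`, the choice of one-dimensional numeric tuples, and the similarity 1 / (1 + |a - b|) are assumptions made for illustration and do not come from the entry.

```python
from itertools import permutations
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def similarity_1d(a: float, b: float) -> float:
    """Hypothetical similarity in [0, 1]: 1 for equal values, decaying with distance."""
    return 1.0 / (1.0 + abs(a - b))


def satisfies_cluster_condition(
    clusters: Sequence[Sequence[T]],
    s: Callable[[T, T], float],
) -> bool:
    """Return True if, for every cluster, each ordered pair of members (t_i, t_r)
    satisfies s(t_i, t_r) >= s(t_i, t_s) for every tuple t_s outside the cluster."""
    for i, cluster in enumerate(clusters):
        # All tuples that belong to some other cluster.
        outsiders: List[T] = [
            t for j, other in enumerate(clusters) if j != i for t in other
        ]
        # Ordered pairs of distinct positions inside the same cluster.
        for t_i, t_r in permutations(cluster, 2):
            if any(s(t_i, t_r) < s(t_i, t_s) for t_s in outsiders):
                return False
    return True


good = [[1.0, 1.2], [5.0, 5.3]]   # nearby values grouped together
bad = [[1.0, 5.0], [1.2, 5.3]]    # nearby values split across clusters

print(satisfies_cluster_condition(good, similarity_1d))  # True
print(satisfies_cluster_condition(bad, similarity_1d))   # False
```

The check is quadratic in cluster size times the number of outside tuples, so it is meant for small illustrative inputs rather than large databases.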

Key Points

Database clustering is the process of grouping data objects (referred to as tuples in a database) based on a user-defined similarity function. Intuitively, data objects in the same cluster are “similar” to each other, and data objects in different clusters are “dissimilar”. Similarity can be defined in many different ways, such as Euclidean distance, cosine similarity, or the dot product. For data objects, their membership belonging to a...
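The three measures named above can be sketched for numeric tuples as follows. This is an illustrative sketch using only the Python standard library; the function names are hypothetical. Note that Euclidean distance is a dissimilarity, so the helper `distance_to_similarity`, which maps a distance d to 1 / (1 + d), is one assumed way (among many) to obtain a score in the range 0 to 1; the entry does not prescribe a particular conversion.

```python
import math
from typing import Sequence


def dot_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Sum of element-wise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line distance; smaller values mean more similar tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; undefined for zero vectors."""
    norm = math.sqrt(dot_product(u, u)) * math.sqrt(dot_product(v, v))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(u, v) / norm


def distance_to_similarity(d: float) -> float:
    """Assumed conversion of a non-negative distance to a score in (0, 1]."""
    return 1.0 / (1.0 + d)


t1, t2 = [0.0, 0.0], [3.0, 4.0]
print(euclidean_distance(t1, t2))                          # 5.0
print(distance_to_similarity(euclidean_distance(t1, t1)))  # 1.0
print(dot_product([1.0, 2.0], [3.0, 4.0]))                 # 11.0
```

Cosine similarity ignores vector length and the dot product does not, so the same pair of tuples can rank differently under each measure.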

Recommended Reading

  1. Ester M, Kriegel HP, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining; 1996. p. 226–31.
  2. Han J, Kamber M, Tung AKH. Spatial clustering methods in data mining: a survey. In: Miller H, Han J, editors. Geographic data mining and knowledge discovery. London: Taylor and Francis; 2001.
  3. Zhang T, Ramakrishnan R, Livny M. BIRCH: an efficient data clustering method for very large databases. In: Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data, Quebec; 1996. p. 103–14.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. The University of Queensland, Brisbane, Australia

Section editors and affiliations

  • Xiaofang Zhou (1)
  1. School of Inf. Tech. & Elec. Eng., Univ. of Queensland, Brisbane, Australia