Cluster Analysis

Aggarwal, Charu C.

doi:10.1007/978-3-319-14142-8_6

Abstract

Many applications require the partitioning of data points into intuitively similar groups. The partitioning of a large number of data points into a smaller number of groups helps greatly in summarizing the data and understanding it for a variety of data mining applications.

Notes

1. For a fixed cluster assignment \({\cal C}_1 \ldots {\cal C}_k\), the gradient of the clustering objective function \(\sum _{j=1}^k \sum _{ \overline {X_i} \in {\cal C}_j} || \overline {X_i} - \overline {Y_j}||^2\) with respect to \(\overline {Y_j}\) is \(-2 \sum _{ \overline {X_i} \in {\cal C}_j} (\overline {X_i} - \overline {Y_j})\). Setting the gradient to 0 yields the mean of cluster \({\cal C}_j\) as the optimum value of \(\overline {Y_j}\). Note that the other clusters do not contribute to the gradient, and therefore the approach effectively optimizes the local clustering objective function for \({\cal C}_j\).
2. http://www.dmoz.org
3. This is achieved by setting the partial derivative of \({\cal L}({\cal D}|{\cal M})\) (see Eq. 6.12) with respect to each parameter in \(\overline {\mu _i}\) and \(\sigma \) to 0.
4. The parameter \(MinPts\) is used in the original DBSCAN description. However, the notation \(\tau \) is used here to retain consistency with the grid-clustering description.
5. See [257], which is a graph-based alternative to the LOF algorithm for locality-sensitive outlier analysis.
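
The argument in Note 1 can be checked numerically. The following is a minimal sketch, not code from the chapter: the helper names (`sse`, `sse_gradient`, `mean`) and the toy cluster values are assumptions made purely for illustration. It evaluates the gradient of the within-cluster sum of squared distances at the cluster mean and compares the objective there with its value at a shifted representative.

```python
# Illustrative check of Note 1 (all names and data here are hypothetical):
# for a fixed assignment, the representative that minimizes the sum of
# squared distances within one cluster is the cluster mean.

def sse(points, y):
    """Sum of squared Euclidean distances from each point to representative y."""
    return sum(sum((p - q) ** 2 for p, q in zip(x, y)) for x in points)

def sse_gradient(points, y):
    """Gradient of sse with respect to y, i.e. -2 * sum_i (x_i - y)."""
    return [-2.0 * sum(x[d] - y[d] for x in points) for d in range(len(y))]

def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(x[d] for x in points) / n for d in range(len(points[0]))]

# Made-up toy cluster in two dimensions.
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]

centroid = mean(cluster)
grad_at_mean = sse_gradient(cluster, centroid)

# A representative shifted away from the mean, for comparison.
shifted = [centroid[0] + 0.5, centroid[1]]

print("centroid:", centroid)
print("gradient at centroid:", grad_at_mean)
print("objective at centroid:", sse(cluster, centroid))
print("objective at shifted point:", sse(cluster, shifted))
```

Because the objective is a sum of convex quadratics in the representative, the point where the gradient vanishes is a global minimum for that cluster, which is why the shifted representative yields a larger objective value.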

Author information

Authors and Affiliations

IBM T.J. Watson Research Center, Yorktown Heights, New York, USA
Charu C. Aggarwal

Corresponding author

Correspondence to Charu C. Aggarwal.

About this chapter

Cite this chapter

Aggarwal, C. (2015). Cluster Analysis. In: Data Mining. Springer, Cham. https://doi.org/10.1007/978-3-319-14142-8_6

DOI: https://doi.org/10.1007/978-3-319-14142-8_6
Published: 14 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14141-1
Online ISBN: 978-3-319-14142-8
eBook Packages: Computer Science, Computer Science (R0)
