Cluster Analysis: Advanced Concepts

Data Mining

Abstract

In the previous chapter, the basic data clustering methods were introduced. In this chapter, several advanced clustering scenarios will be studied, such as the impact of the size, dimensionality, or type of the underlying data. In addition, it is possible to obtain significant insights with the use of advanced supervision methods, or with the use of ensemble-based algorithms. In particular, two important aspects of clustering algorithms will be addressed:

Notes

  1.

    It is possible to store the sum of the values in \(\overline {SS}\) across the \(d\) dimensions in lieu of \(\overline {SS}\), without affecting the usability of the cluster feature. This would result in a cluster feature of size \((d+2)\) instead of \((2 \cdot d +1)\).

  2.

    The original BIRCH algorithm proposes to use the pairwise root mean square (RMS) distance between cluster data points as the diameter. This is one possible measure of the intracluster distance. This value can also be shown to be computable from the CF vector as \(\sqrt {\frac { \sum _{i=1}^d ( 2 \cdot m \cdot SS_i - 2\cdot LS_i^2)}{m \cdot (m-1)}}\).

  3.

    http://www.dmoz.org/.

  4.

    See discussion in Chap. 6 about Fig. 6.14.
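Notes 1 and 2 can be checked numerically. The following is a minimal sketch, not the BIRCH implementation: it assumes a cluster feature stored as a tuple (SS, LS, m) of per-dimension sums of squares, per-dimension linear sums, and the point count, and all function names are hypothetical. It shows that the compact form of Note 1 (a single scalar in place of the \(d\) values of \(\overline {SS}\)) still yields the RMS diameter of Note 2, and that both agree with a direct computation over all pairs of points.

```python
import math
from itertools import combinations


def make_cf(points):
    """Build a cluster feature (SS per dimension, LS per dimension, m)."""
    d = len(points[0])
    ss = [sum(p[i] ** 2 for p in points) for i in range(d)]
    ls = [sum(p[i] for p in points) for i in range(d)]
    return ss, ls, len(points)


def merge_cf(a, b):
    """Cluster features are additive: merging sums each component."""
    return (
        [x + y for x, y in zip(a[0], b[0])],
        [x + y for x, y in zip(a[1], b[1])],
        a[2] + b[2],
    )


def compact_cf(cf):
    """Note 1: store the sum of SS over all dimensions (d + 2 values)."""
    ss, ls, m = cf
    return sum(ss), ls, m


def rms_diameter(cf):
    """Note 2: pairwise RMS distance from the full CF (requires m >= 2)."""
    ss, ls, m = cf
    total = sum(2 * m * s - 2 * l * l for s, l in zip(ss, ls))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0) / (m * (m - 1)))


def rms_diameter_compact(ccf):
    """Same quantity computed from the compact (d + 2)-value feature."""
    ss_total, ls, m = ccf
    total = 2 * m * ss_total - 2 * sum(l * l for l in ls)
    return math.sqrt(max(total, 0.0) / (m * (m - 1)))


def rms_diameter_brute(points):
    """Reference value: RMS of the distances over all distinct pairs."""
    sq = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in combinations(points, 2)
    ]
    return math.sqrt(sum(sq) / len(sq))


points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
cf = make_cf(points)
print(rms_diameter(cf), rms_diameter_compact(compact_cf(cf)), rms_diameter_brute(points))
```

The diameter depends on \(\overline {SS}\) only through its sum over the dimensions, which is why the compact form loses nothing for this purpose. Because every component is a plain sum, the compact feature stays additive under merging as well.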

Author information

Corresponding author

Correspondence to Charu C. Aggarwal.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Aggarwal, C. (2015). Cluster Analysis: Advanced Concepts. In: Data Mining. Springer, Cham. https://doi.org/10.1007/978-3-319-14142-8_7

  • DOI: https://doi.org/10.1007/978-3-319-14142-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14141-1

  • Online ISBN: 978-3-319-14142-8

  • eBook Packages: Computer Science, Computer Science (R0)