
Data Analytics, pp 103-122

Clustering

  • Thomas A. Runkler
Chapter

Abstract

Clustering is an unsupervised learning task that assigns labels to objects in unlabeled data. When clustering is performed on data that do have physical classes, the clusters may or may not correspond to those classes. Cluster partitions may be represented mathematically by sets, partition matrices, and/or cluster prototypes. Sequential clustering (single linkage, complete linkage, average linkage, Ward’s method, etc.) is easy to implement but computationally expensive. Partitional clustering can be based on hard, fuzzy, possibilistic, or noise clustering models. Cluster prototypes can take many forms, such as hyperspherical, ellipsoidal, linear, circular, or more complex shapes. Relational clustering models find clusters in relational data. Complex relational clusters can be found by kernelization. Cluster tendency assessment determines whether the data contain clusters at all, and cluster validity measures help to identify an appropriate number of clusters. Clustering can also be performed by heuristic methods such as the self-organizing map.
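
The three partition representations named in the abstract (sets, partition matrices, cluster prototypes) can be illustrated with a minimal sketch. It is not taken from the chapter: the toy points, the fixed label vector, and the function names are all hypothetical, the partition is hard (each point belongs to exactly one cluster), and the prototypes are assumed to be plain cluster means.

```python
# Hypothetical toy data: five 2-D points with an assumed hard assignment
# to c = 2 clusters. The labels are given, not computed by any algorithm.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = [0, 0, 0, 1, 1]
c = 2


def partition_sets(labels, c):
    """Representation 1: one set of point indices per cluster."""
    return [{k for k, lab in enumerate(labels) if lab == i} for i in range(c)]


def partition_matrix(labels, c):
    """Representation 2: c x n matrix U, U[i][k] = 1 if point k is in cluster i."""
    return [[1 if lab == i else 0 for lab in labels] for i in range(c)]


def prototypes(points, U):
    """Representation 3: one prototype per cluster (membership-weighted mean).

    Assumes every cluster has at least one member, so the weight is nonzero.
    """
    dim = len(points[0])
    centers = []
    for row in U:
        weight = sum(row)
        centers.append(tuple(
            sum(u * p[d] for u, p in zip(row, points)) / weight
            for d in range(dim)
        ))
    return centers


cluster_sets = partition_sets(labels, c)
U = partition_matrix(labels, c)
V = prototypes(points, U)

print(cluster_sets)
print(U)
print(V)
```

In a hard partition every column of U contains exactly one 1. Fuzzy models relax the entries to values in [0, 1], in which case the same weighted-mean form of the prototype would still apply under the assumptions above.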

Keywords

Cluster Center, Cluster Model, Fuzzy Cluster, Reference Vector, Cluster Partition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Vieweg+Teubner Verlag | Springer Fachmedien Wiesbaden 2012

Authors and Affiliations

  1. München, Germany
