Online Cluster Prototype Generation for the Gravitational Clustering Algorithm

  • Elizabeth León
  • Jonatan Gómez
  • Fabián Giraldo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7637)

Abstract

Data clustering is a popular data mining technique for discovering the structure of a data set. However, the usefulness of the results depends on the nature of the cluster prototypes generated by the clustering technique. Some clustering algorithms only label the data points, so the prototype of a cluster is the full set of data points belonging to it. Other techniques produce a single 'abstract' data point as the model for the whole cluster, losing information about the shape, size and structure of the cluster. This paper proposes an online cluster prototype generation mechanism for the Gravitational Clustering algorithm. The idea is to use the dynamics of the gravitational system and the inherent hierarchical property of the gravitational algorithm to determine summarized prototypes of the clusters at the same time as the algorithm is finding those clusters. In this way, a cluster is represented by several different 'abstract' data points, which allows the algorithm to find an appropriate representation of the clusters it finds. The performance of the proposed mechanism is evaluated experimentally on two types of synthetic data sets: data sets with Gaussian clusters and data sets with non-parametric clusters. Our results show that the proposed mechanism is able to deal with noise, finds the appropriate number of clusters and finds an appropriate set of cluster prototypes.
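
The general idea can be illustrated with a minimal Python sketch. It is not the mechanism proposed in the paper, whose details the abstract does not give. It assumes a simplified gravitational scheme with unit masses: randomly paired points attract each other, the gravitational constant decays over time, and points that come within a merge distance are joined in a disjoint-set structure. The prototype step (`summarize`) is a stand-in chosen for illustration: it keeps the final positions in each cluster that are farther than a given radius from each other. All function names, parameter names and default values are hypothetical.

```python
import math
import random


def find(parent, i):
    """Return the representative of i's set (with path halving)."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def gravitational_clustering(points, g=1e-4, decay=0.01, eps=1e-2,
                             iterations=200, seed=0):
    """Toy gravitational clustering; every parameter here is an assumption."""
    rng = random.Random(seed)
    pos = [list(map(float, p)) for p in points]  # positions that will move
    n = len(pos)
    parent = list(range(n))  # disjoint sets: one cluster per point at start
    for _ in range(iterations):
        for i in range(n):
            j = rng.randrange(n)  # random interaction partner
            diff = [b - a for a, b in zip(pos[i], pos[j])]
            dist = math.sqrt(sum(d * d for d in diff))
            if dist <= eps:
                # Close enough: merge the two clusters (no-op if i == j).
                parent[find(parent, i)] = find(parent, j)
                continue
            # Unit-mass attraction: displacement g / dist^2 along diff,
            # capped so the two points meet at most at their midpoint.
            step = min(g / dist ** 3, 0.5)
            for k, d in enumerate(diff):
                pos[i][k] += step * d
                pos[j][k] -= step * d
        g *= 1.0 - decay  # cool the system so it eventually settles
    labels = [find(parent, i) for i in range(n)]
    return labels, pos


def summarize(labels, pos, radius=0.05):
    """Stand-in prototype step: keep mutually distant final positions."""
    prototypes = {}
    for label, p in zip(labels, pos):
        protos = prototypes.setdefault(label, [])
        if all(math.dist(p, q) > radius for q in protos):
            protos.append(tuple(p))
    return prototypes


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.001, 0.0), (100.0, 100.0)]
    labels, final_pos = gravitational_clustering(data)
    print(labels, summarize(labels, final_pos))
```

In this sketch a cluster whose points have collapsed to one location is summarized by a single prototype, while a cluster whose points settled at several separated locations keeps one prototype per location, which is one simple way to retain some information about shape and size.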

Keywords

clustering, gravitational, hierarchical, prototype

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Elizabeth León (1)
  • Jonatan Gómez (1)
  • Fabián Giraldo (1)
  1. Computer Systems Engineering Department, Universidad Nacional de Colombia, Colombia
