α-Clusterable Sets

  • Gerasimos S. Antzoulatos
  • Michael N. Vrahatis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6911)


Despite the increasing interest in clustering research over the last decades, a unified clustering theory that is independent of any particular algorithm, underlying data structure, or even objective function has not been formulated so far. In this paper, we take the first steps towards a theoretical foundation of clustering by proposing a new notion of “clusterability” of data sets based on the density of the data within a specific region. Specifically, we give a formal definition of what we call an “α-clusterable” set, and we use this notion to prove that the principles proposed in Kleinberg’s impossibility theorem for clustering [25] are consistent. We further propose an unsupervised clustering algorithm based on the notion of an α-clusterable set. The proposed algorithm exploits the ability of the well-known and widely used particle swarm optimization [31] to maximize the recently proposed window density function [38]. The clustering quality obtained compares favorably with that of various other well-known clustering algorithms.
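The idea of using particle swarm optimization to maximize a window density function can be illustrated with a minimal sketch. The abstract does not define the window density function, so the code below assumes it simply counts the data points that fall inside an axis-aligned window of fixed half-width around a candidate center. The names `window_density` and `pso_maximize`, the fixed window size, and the parameter values `w`, `c1`, `c2` (common textbook settings) are all illustrative assumptions and not the authors' implementation.

```python
import random


def window_density(center, points, half_width):
    """Count the points inside the axis-aligned window around `center`.

    Assumed form of a window density function; the exact definition
    used in the paper is not given in the abstract.
    """
    return sum(
        all(abs(p[d] - center[d]) <= half_width for d in range(len(center)))
        for p in points
    )


def pso_maximize(objective, bounds, n_particles=20, n_iters=60, seed=0):
    """Plain global-best PSO that maximizes `objective` inside `bounds`.

    The inertia and acceleration values are assumed, conventional settings.
    """
    rng = random.Random(seed)
    w, c1, c2 = 0.72, 1.49, 1.49
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Hypothetical data: one dense blob near (2, 2) plus uniform background noise.
data_rng = random.Random(1)
points = [(data_rng.gauss(2.0, 0.3), data_rng.gauss(2.0, 0.3)) for _ in range(100)]
points += [(data_rng.uniform(0, 10), data_rng.uniform(0, 10)) for _ in range(20)]

center, density = pso_maximize(
    lambda c: window_density(c, points, 1.0), [(0.0, 10.0), (0.0, 10.0)]
)
```

Because a point count is piecewise constant, the swarm gets no gradient-like signal in empty regions, so the window it returns depends on the initial particle positions. A full clustering procedure would need further steps, such as locating several windows and assigning points to them, which this sketch does not attempt.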


Keywords: Particle Swarm Optimisation, Cluster Algorithm, Particle Swarm, Differential Evolution, Dense Region




References

  1. Abraham, A., Grosan, C., Ramos, V.: Swarm Intelligence in Data Mining. Springer, Heidelberg (2006)
  2. Ackerman, M., Ben-David, S.: Measures of clustering quality: A working set of axioms for clustering. In: Advances in Neural Information Processing Systems (NIPS), pp. 121–128. MIT Press, Cambridge (2008)
  3. Ackerman, M., Ben-David, S.: Clusterability: A theoretical study. Journal of Machine Learning Research - Proceedings Track 5, 1–8 (2009)
  4. Alevizos, P.: An algorithm for orthogonal range search in d ≥ 3 dimensions. In: Proceedings of the 14th European Workshop on Computational Geometry (1998)
  5. Alevizos, P., Boutsinas, B., Tasoulis, D.K., Vrahatis, M.N.: Improving the orthogonal range search k-windows algorithms. In: 14th IEEE International Conference on Tools with Artificial Intelligence, pp. 239–245 (2002)
  6. Antzoulatos, G.S., Ikonomakis, F., Vrahatis, M.N.: Efficient unsupervised clustering through intelligent optimization. In: Proceedings of the IASTED International Conference Artificial Intelligence and Soft Computing (ASC 2009), pp. 21–28 (2009)
  7. Arabie, P., Hubert, L.: An overview of combinatorial data analysis. In: Clustering and Classification, pp. 5–64. World Scientific Publishing Co., Singapore (1996)
  8. Ball, G., Hall, D.: A clustering technique for summarizing multivariate data. Behavioral Science 12, 153–155 (1967)
  9. Berkhin, P.: Survey of data mining techniques. Technical report, Accrue Software (2002)
  10. Berry, M.J.A., Linoff, G.: Data mining techniques for marketing, sales and customer support. John Wiley & Sons Inc., USA (1996)
  11. Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, Norwell (1981)
  12. Chen, C.Y., Ye, F.: Particle swarm optimization algorithm and its application to clustering analysis. In: IEEE International Conference on Networking, Sensing and Control, vol. 2, pp. 789–794 (2004)
  13. Cohen, S.C.M., Castro, L.N.: Data clustering with particle swarms. In: IEEE Congress on Evolutionary Computation, CEC 2006, pp. 1792–1798 (2006)
  14. Das, S., Abraham, A., Konar, A.: Automatic clustering using an improved differential evolution algorithm. IEEE Transactions on Systems, Man and Cybernetics 38, 218–237 (2008)
  15. Dubes, R.: Cluster Analysis and Related Issues. In: Handbook of Pattern Recognition and Computer Vision, pp. 3–32. World Scientific, Singapore (1993)
  16. Engelbrecht, A.P.: Computational Intelligence: An Introduction. John Wiley & Sons, Ltd., Chichester (2007)
  17. Epter, S., Krishnamoorthy, M., Zaki, M.: Clusterability detection and initial seed selection in large datasets. Technical Report 99-6, Rensselaer Polytechnic Institute, Computer Science Dept. (1999)
  18. Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining, pp. 226–231 (1996)
  19. Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers, San Francisco (2006)
  20. Jain, A.K., Dubes, R.: Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs (1988)
  21. Jain, A.K., Flynn, P.J.: Image segmentation using clustering. In: Advances in Image Understanding: A Festschrift for Azriel Rosenfeld, pp. 65–83. Wiley - IEEE Computer Society Press, Singapore (1996)
  22. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: a review. ACM Computing Surveys 31, 264–323 (1999)
  23. Kennedy, J., Eberhart, R.C.: Particle swarm optimization. In: Proceedings of IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948 (1995)
  24. Kennedy, J., Eberhart, R.C.: Swarm Intelligence. Morgan Kaufmann Publishers, San Francisco (2001)
  25. Kleinberg, J.: An impossibility theorem for clustering. In: Advances in Neural Information Processing Systems (NIPS), pp. 446–453. MIT Press, Cambridge (2002)
  26. Lisi, F., Corazza, M.: Clustering financial data for mutual fund management. In: Mathematical and Statistical Methods in Insurance and Finance, pp. 157–164. Springer, Milan (2007)
  27. MacQueen, J.B.: Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press (1967)
  28. Ng, R., Han, J.: CLARANS: A method for clustering objects for spatial data mining. IEEE Transactions on Knowledge and Data Engineering 14(5), 1003–1016 (2002)
  29. Omran, M.G.H., Engelbrecht, A.P.: Self-adaptive differential evolution methods for unsupervised image classification. In: Proceedings of IEEE Conference on Cybernetics and Intelligent Systems, pp. 1–6 (2006)
  30. Ostrovsky, R., Rabani, Y., Schulman, L.J., Swamy, C.: The effectiveness of Lloyd-type methods for the k-means problem. In: Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, pp. 165–176. IEEE Computer Society, Washington, DC (2006)
  31. Parsopoulos, K.E., Vrahatis, M.N.: Particle Swarm Optimization and Intelligence: Advances and Applications. Information Science Publishing (IGI Global), Hershey (2010)
  32. Paterlini, S., Krink, T.: Differential evolution and particle swarm optimisation in partitional clustering. Computational Statistics & Data Analysis 50, 1220–1247 (2006)
  33. Pavlidis, N., Plagianakos, V.P., Tasoulis, D.K., Vrahatis, M.N.: Financial forecasting through unsupervised clustering and neural networks. Operations Research - An International Journal 6(2), 103–127 (2006)
  34. Preparata, F., Shamos, M.: Computational Geometry: An Introduction. Springer, New York (1985)
  35. Puzicha, J., Hofmann, T., Buhmann, J.: A theory of proximity based clustering: Structure detection by optimisation. Pattern Recognition 33, 617–634 (2000)
  36.
  37. Tasoulis, D.K., Plagianakos, V.P., Vrahatis, M.N.: Unsupervised clustering in mRNA expression profiles. Computers in Biology and Medicine 36, 1126–1142 (2006)
  38. Tasoulis, D.K., Vrahatis, M.N.: The new window density function for efficient evolutionary unsupervised clustering. In: IEEE Congress on Evolutionary Computation, CEC 2005, vol. 3, pp. 2388–2394. IEEE Press, Los Alamitos (2005)
  39. Theodoridis, S., Koutroumbas, K.: Pattern Recognition. Academic Press, London (1999)
  40. van der Merwe, D.W., Engelbrecht, A.P.: Data clustering using particle swarm optimization. In: Proceedings of the 2003 IEEE Congress on Evolutionary Computation, pp. 215–220 (2003)
  41. Vrahatis, M.N., Boutsinas, B., Alevizos, P., Pavlides, G.: The new k-windows algorithm for improving the k-means clustering algorithm. Journal of Complexity 18, 375–391 (2002)
  42. Xiong, H., Wu, J., Chen, J.: K-means clustering versus validation measures: A data-distribution perspective. IEEE Transactions on Systems, Man and Cybernetics - Part B: Cybernetics 39(2), 318–331 (2009)
  43. Zhao, Y., Karypis, G.: Criterion Functions for Clustering on High-Dimensional Data. In: Grouping Multidimensional Data: Recent Advances in Clustering, pp. 211–237. Springer, Heidelberg (2006)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gerasimos S. Antzoulatos (1)
  • Michael N. Vrahatis (1)

  1. Computational Intelligence Laboratory (CILAB), Department of Mathematics, University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, Patras, Greece
